CELL CYCLE REGULATORY PROTEINS CDC-16, DP-1, DP-2 AND E2F FROM PLANTS

This application claims the benefit of U.S. Provisional Application 
No. 60/081,132, filed April 9, 1998. 

FIELD OF THE INVENTION

This invention is in the field of plant molecular biology. More specifically, this 
invention pertains to nucleic acid fragments encoding cell cycle regulatory proteins in plants 
and seeds. 

BACKGROUND OF THE INVENTION 
Cells divide by duplicating their chromosomes and segregating one copy of each 
duplicated chromosome, as well as providing essential organelles, to each of two daughter 
cells. Regulation of cell division is critical for the normal development of multicellular 
organisms. A cell that is destined to grow and divide must pass through specific phases of a 
cell cycle: G1, S (period of DNA synthesis), G2, and M (mitosis). Studies have shown that
cell division is controlled via the regulation of two critical events during the cell cycle: 
initiation of DNA synthesis and the initiation of mitosis. Several kinase proteins control cell 
cycle progression through these events. These protein kinases are heterodimeric proteins, 
having a cyclin-dependent kinase (Cdk) subunit and a cyclin subunit that provides the
regulatory specificity to the heterodimeric protein. These heterodimeric proteins regulate
the cell cycle by interacting with proteins involved in the initiation of DNA synthesis and
mitosis and phosphorylating them at specific regulatory sites, activating some and
inactivating others. The cyclin subunit concentration varies in phase with the cell cycle while
the concentration of the Cdks remains relatively constant throughout the cell cycle.

In mammalian cells transcription factor E2F genes encode a family of proteins that 
bind to the nucleotide sequence TTTCGCGC and regulate the expression of various cellular 
and viral promoters. Another transcription factor family of proteins, DP-1 and DP-2, can 
form heterodimers with E2F proteins in vivo (Helin, K. et al. (1994) Genes Dev.
7:1850-1861; Ivey-Hoyle, M. et al. (1993) Mol. Cell. Biol. 13(12):7802-7812; and Zhang,
Y. et al. (1995) Oncogene 10(11):2085-2093). The E2F-DP transcription factors are major
regulators of genes that are required for the progression of S-phase, such as DHFR and DNA 
polymerase alpha, and they play a critical role in cell cycle regulation and differentiation. 
The retinoblastoma tumor suppressor protein has been shown to induce growth arrest by 
binding to E2F-DP and repressing its activity. Lastly, CDC-16 is a cell cycle protein
identified in mammalian and yeast cells. This protein has been shown to localize to the
centrosome and mitotic spindle and is essential for the metaphase to anaphase transition
during cell division (Lamb, J. R. et al. (1994) EMBO J. 13(18):4321-4328).

There is a great deal of interest in identifying the genes that encode cell cycle 
regulatory proteins in plants. These genes may be used to express cell cycle regulatory 
nucleic acid sequences encoding all or a portion of a cell cycle regulatory protein would
facilitate studies to better understand cell cycle regulation in plants, provide genetic tools to
enhance cell growth in tissue culture, increase the efficiency of gene transfer and help
provide more stable transformations. Cell cycle regulatory proteins may also provide targets
to facilitate design and/or identification of inhibitors of cell cycle regulatory proteins that
may be useful as herbicides.

SUMMARY OF THE INVENTION
The instant invention relates to isolated nucleic acid fragments encoding cell cycle
regulatory proteins. Specifically, this invention concerns an isolated nucleic acid fragment
encoding a CDC-16, DP-1, DP-2 or E2F protein. In addition, this invention relates to a
nucleic acid fragment that is complementary to the nucleic acid fragment encoding a
CDC-16, DP-1, DP-2 or E2F protein.

An additional embodiment of the instant invention pertains to a polypeptide encoding 
all or a substantial portion of a cell cycle regulatory protein selected from the group 
consisting of CDC-16, DP-1, DP-2 and E2F.

In another embodiment, the instant invention relates to a chimeric gene encoding a
CDC-16, DP-1, DP-2 or E2F protein, or to a chimeric gene that comprises a nucleic acid
fragment that is complementary to a nucleic acid fragment encoding a CDC-16, DP-1, DP-2
or E2F protein, operably linked to suitable regulatory sequences, wherein expression of the
chimeric gene results in production of levels of the encoded protein in a transformed host
cell that are altered (i.e., increased or decreased) from the level produced in an untransformed
host cell.

In a further embodiment, the instant invention concerns a transformed host cell
comprising in its genome a chimeric gene encoding a CDC-16, DP-1, DP-2 or E2F protein,
operably linked to suitable regulatory sequences. Expression of the chimeric gene results in
production of altered levels of the encoded protein in the transformed host cell. The
transformed host cell can be of eukaryotic or prokaryotic origin, and includes cells derived
from higher plants and microorganisms. The invention also includes transformed plants that
arise from transformed host cells of higher plants, and seeds derived from such transformed
plants.

An additional embodiment of the instant invention concerns a method of altering the
level of expression of a CDC-16, DP-1, DP-2 or E2F protein in a transformed host cell
comprising: a) transforming a host cell with a chimeric gene comprising a nucleic acid
fragment encoding a CDC-16, DP-1, DP-2 or E2F protein; and b) growing the transformed
host cell under conditions that are suitable for expression of the chimeric gene wherein
expression of the chimeric gene results in production of altered levels of CDC-16, DP-1,
DP-2 or E2F protein in the transformed host cell.

An additional embodiment of the instant invention concerns a method for obtaining a
nucleic acid fragment encoding all or a substantial portion of an amino acid sequence
encoding a CDC-16, DP-1, DP-2 or E2F protein.

A further embodiment of the instant invention is a method for evaluating at least one
compound for its ability to inhibit the activity of a CDC-16, DP-1, DP-2 or E2F protein, the
method comprising the steps of: (a) transforming a host cell with a chimeric gene
comprising a nucleic acid fragment encoding a CDC-16, DP-1, DP-2 or E2F protein,
operably linked to suitable regulatory sequences; (b) growing the transformed host cell under
conditions that are suitable for expression of the chimeric gene wherein expression of the
chimeric gene results in production of CDC-16, DP-1, DP-2 or E2F protein in the
transformed host cell; (c) optionally purifying the CDC-16, DP-1, DP-2 or E2F protein
expressed by the transformed host cell; (d) treating the CDC-16, DP-1, DP-2 or E2F protein
with a compound to be tested; and (e) comparing the activity of the CDC-16, DP-1, DP-2 or
E2F protein that has been treated with a test compound to the activity of an untreated
CDC-16, DP-1, DP-2 or E2F protein, thereby selecting compounds with potential for
inhibitory activity.

BRIEF DESCRIPTION OF THE SEQUENCE LISTINGS 
The invention can be more fully understood from the following detailed description
and the accompanying Sequence Listing which form a part of this application.

The following sequence descriptions and Sequence Listing attached hereto comply
with the rules governing nucleotide and/or amino acid sequence disclosures in patent
applications as set forth in 37 C.F.R. §1.821-1.825.

SEQ ID NO:1 is the nucleotide sequence comprising a portion of the cDNA insert in
clone cpflc.pk001.kl3 encoding a portion of a corn CDC-16 protein.

SEQ ID NO:2 is the deduced amino acid sequence of a portion of a CDC-16 protein
derived from the nucleotide sequence of SEQ ID NO:1.

SEQ ID NO:3 is the nucleotide sequence comprising a portion of the cDNA insert in
clone sfll.pk0030.e3 encoding a portion of a soybean CDC-16 protein.

SEQ ID NO:4 is the deduced amino acid sequence of a portion of a CDC-16 protein
derived from the nucleotide sequence of SEQ ID NO:3.

SEQ ID NO:5 is the nucleotide sequence comprising a portion of the cDNA insert in
clone ids.pk0025.f7 encoding a portion of an Impatiens balsamina DP-1 protein.

SEQ ID NO:6 is the deduced amino acid sequence of a portion of a DP-1 protein
derived from the nucleotide sequence of SEQ ID NO:5.

SEQ ID NO:7 is the nucleotide sequence comprising a portion of the cDNA insert in
clone p0072.comfs64r encoding a portion of a corn DP-1 protein.

SEQ ID NO:8 is the deduced amino acid sequence of a portion of a DP-1 protein
derived from the nucleotide sequence of SEQ ID NO:7.

SEQ ID NO:9 is the nucleotide sequence comprising a portion of the cDNA insert in
clone src3c.pk018.m11 encoding a portion of a soybean DP-1 protein.

SEQ ID NO:10 is the deduced amino acid sequence of a portion of a DP-1 protein
derived from the nucleotide sequence of SEQ ID NO:9.

SEQ ID NO:11 is the nucleotide sequence comprising a contig assembled from the
cDNA inserts in clones p0005.cbmfh22r, cdelc.pk001.jl3 and cen3n.pk0183.b9 encoding a
portion of a corn DP-2 protein.

SEQ ID NO:12 is the deduced amino acid sequence of a portion of a DP-2 protein
derived from the nucleotide sequence of SEQ ID NO:11.

SEQ ID NO:13 is the nucleotide sequence comprising the entire cDNA insert in clone
wlmkl.pk0005.e2 encoding a portion of a wheat DP-2 protein.

SEQ ID NO:14 is the deduced amino acid sequence of a portion of a DP-2 protein
derived from the nucleotide sequence of SEQ ID NO:13.

SEQ ID NO:15 is the nucleotide sequence comprising a portion of the cDNA insert in
clone rslln.pk004.dl5 encoding a portion of a rice E2F protein.

SEQ ID NO:16 is the deduced amino acid sequence of a portion of an E2F protein
derived from the nucleotide sequence of SEQ ID NO:15.

SEQ ID NO:17 is the nucleotide sequence comprising the entire cDNA insert in clone
sel.pk0012.f4 encoding a portion of a soybean E2F protein.

SEQ ID NO:18 is the deduced amino acid sequence of a portion of an E2F protein
derived from the nucleotide sequence of SEQ ID NO:17.

The Sequence Listing contains the one letter code for nucleotide sequence characters
and the three letter codes for amino acids as defined in conformity with the IUPAC-IUBMB
standards described in Nucleic Acids Research 13:3021-3030 (1985) and in the Biochemical
Journal 219 (No. 2):345-373 (1984) which are herein incorporated by reference. The
symbols and format used for nucleotide and amino acid sequence data comply with the rules
set forth in 37 C.F.R. §1.822.

DETAILED DESCRIPTION OF THE INVENTION 
In the context of this disclosure, a number of terms shall be utilized. As used herein,
an "isolated nucleic acid fragment" is a polymer of RNA or DNA that is single- or
double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases. An
isolated nucleic acid fragment in the form of a polymer of DNA may be comprised of one or
more segments of cDNA, genomic DNA or synthetic DNA. As used herein, "contig" refers
to an assemblage of overlapping nucleic acid sequences to form one contiguous nucleotide
sequence. For example, several DNA sequences can be compared and aligned to identify
common or overlapping regions. The individual sequences can then be assembled into a
single contiguous nucleotide sequence.
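
The idea of assembling a contig from overlapping sequences can be illustrated with a minimal Python sketch. It assumes error-free reads on the same strand and an exact suffix/prefix overlap; the function name `merge_overlapping`, the `min_overlap` threshold, and both example reads are hypothetical and are not taken from the Sequence Listing. Real assembly software also handles mismatches, reverse complements and more than two reads.

```python
from typing import Optional


def merge_overlapping(left: str, right: str, min_overlap: int = 10) -> Optional[str]:
    """Merge two reads when a suffix of `left` exactly equals a prefix of `right`.

    Returns the merged contiguous sequence, or None when no exact overlap
    of at least `min_overlap` bases exists.
    """
    longest_possible = min(len(left), len(right))
    # Try the longest candidate overlap first so the merge is as compact as possible.
    for size in range(longest_possible, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


# Hypothetical reads sharing the 13-base region GGATCCGATTACA.
read_a = "ATGGCTAGCTAGGATCCGATTACA"
read_b = "GGATCCGATTACAGGCTTAACGT"
contig = merge_overlapping(read_a, read_b, min_overlap=8)
print(contig)
```

Trying the longest overlap first is a design choice of this sketch; it keeps the shared region from being duplicated in the merged sequence.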

As used herein, "substantially similar" refers to nucleic acid fragments wherein
changes in one or more nucleotide bases result in substitution of one or more amino acids,
but do not affect the functional properties of the protein encoded by the DNA sequence.
"Substantially similar" also refers to nucleic acid fragments wherein changes in one or
more nucleotide bases do not affect the ability of the nucleic acid fragment to mediate
alteration of gene expression by antisense or co-suppression technology. "Substantially
similar" also refers to modifications of the nucleic acid fragments of the instant invention
such as deletion or insertion of one or more nucleotides that do not substantially affect the
functional properties of the resulting transcript vis-a-vis the ability to mediate alteration of
gene expression by antisense or co-suppression technology or alteration of the functional
properties of the resulting protein molecule. It is therefore understood that the invention
encompasses more than the specific exemplary sequences.

For example, it is well known in the art that antisense suppression and co-suppression
of gene expression may be accomplished using nucleic acid fragments representing less than
the entire coding region of a gene, and by nucleic acid fragments that do not share 100%
sequence identity with the gene to be suppressed. Moreover, alterations in a gene which
result in the production of a chemically equivalent amino acid at a given site, but do not
affect the functional properties of the encoded protein, are well known in the art. Thus, a
codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon
encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue,
such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one
negatively charged residue for another, such as aspartic acid for glutamic acid, or one
positively charged residue for another, such as lysine for arginine, can also be expected to
produce a functionally equivalent product. Nucleotide changes which result in alteration of
the N-terminal and C-terminal portions of the protein molecule would also not be expected
to alter the activity of the protein. Each of the proposed modifications is well within the
routine skill in the art, as is determination of retention of biological activity of the encoded
products.

Moreover, substantially similar nucleic acid fragments may also be characterized by
their ability to hybridize, under stringent conditions (0.1X SSC, 0.1% SDS, 65°C), with the
nucleic acid fragments disclosed herein.

Substantially similar nucleic acid fragments of the instant invention may also be
characterized by the percent similarity of the amino acid sequences that they encode to the
amino acid sequences disclosed herein, as determined by algorithms commonly employed by
those skilled in this art. Preferred are those nucleic acid fragments whose nucleotide
sequences encode amino acid sequences that are 80% similar to the amino acid sequences
reported herein. More preferred nucleic acid fragments encode amino acid sequences that
are 90% similar to the amino acid sequences reported herein. Most preferred are nucleic acid
fragments that encode amino acid sequences that are 95% similar to the amino acid
sequences reported herein. Sequence alignments and percent similarity calculations were
performed using the Megalign program of the LASARGENE bioinformatics computing suite
(DNASTAR Inc., Madison, WI). Multiple alignment of the sequences was performed using
the Clustal method of alignment (Higgins, D. G. and Sharp, P. M. (1989) CABIOS
5:151-153) with the default parameters (GAP PENALTY=10, GAP LENGTH
PENALTY=10) (hereafter Clustal algorithm). Default parameters for pairwise alignments
using the Clustal method were KTUPLE=1, GAP PENALTY=3, WINDOW=5 and
DIAGONALS SAVED=5.
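
As a rough illustration of how a percentage can be read off an alignment, the following Python sketch counts identical residues between two sequences that have already been aligned. It is an assumption-laden simplification: it reports percent identity over ungapped columns, which is not necessarily the figure Megalign reports as percent similarity, and the function name `percent_identity` and both example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of ungapped alignment columns in which both residues are identical.

    Both inputs must already be aligned (equal length, '-' marking gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # columns containing a gap are not scored in this sketch
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned peptide fragments: 9 ungapped columns, 7 identical.
seq_a = "MKT-LLVAGE"
seq_b = "MKTALLIAGD"
identity = percent_identity(seq_a, seq_b)
print(round(identity, 1))
```

How gap columns are treated changes the result, so any threshold comparison should use the same convention as the alignment program that produced the reported values.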

A "substantial portion" of an amino acid or nucleotide sequence comprises enough of
the amino acid sequence of a polypeptide or the nucleotide sequence of a gene to afford
putative identification of that polypeptide or gene, either by manual evaluation of the
sequence by one skilled in the art, or by computer-automated sequence comparison and
identification using algorithms such as BLAST (Basic Local Alignment Search Tool;
Altschul, S. F., et al., (1993) J. Mol. Biol. 215:403-410; see also
www.ncbi.nlm.nih.gov/BLAST/). In general, a sequence of ten or more contiguous amino
acids or thirty or more nucleotides is necessary in order to putatively identify a polypeptide
or nucleic acid sequence as homologous to a known protein or gene. Moreover, with respect
to nucleotide sequences, gene specific oligonucleotide probes comprising 20-30 contiguous
nucleotides may be used in sequence-dependent methods of gene identification (e.g.,
Southern hybridization) and isolation (e.g., in situ hybridization of bacterial colonies or
bacteriophage plaques). In addition, short oligonucleotides of 12-15 bases may be used as
amplification primers in PCR in order to obtain a particular nucleic acid fragment
comprising the primers. Accordingly, a "substantial portion" of a nucleotide sequence
comprises enough of the sequence to afford specific identification and/or isolation of a
nucleic acid fragment comprising the sequence. The instant specification teaches partial or
complete amino acid and nucleotide sequences encoding one or more particular plant
proteins. The skilled artisan, having the benefit of the sequences as reported herein, may
now use all or a substantial portion of the disclosed sequences for purposes known to those
skilled in this art. Accordingly, the instant invention comprises the complete sequences as
reported in the accompanying Sequence Listing, as well as substantial portions of those
sequences as defined above.

"Codon degeneracy" refers to divergence in the genetic code permitting variation of 
the nucleotide sequence without affecting the amino acid sequence of an encoded
polypeptide. Accordingly, the instant invention relates to any nucleic acid fragment that
encodes all or a substantial portion of the amino acid sequence encoding the CDC-16, DP-1,
DP-2 or E2F proteins as set forth in SEQ ID NOs:2, 4, 6, 8, 10, 12, 14, 16 and 18. The
skilled artisan is well aware of the "codon-bias" exhibited by a specific host cell in usage of
nucleotide codons to specify a given amino acid. Therefore, when synthesizing a gene for
improved expression in a host cell, it is desirable to design the gene such that its frequency
of codon usage approaches the frequency of preferred codon usage of the host cell.
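
Codon degeneracy can be shown concretely with a short Python sketch that translates two different nucleotide sequences into the same peptide. The table below is the standard genetic code; the helper name `translate` and the two nine-base sequences are hypothetical examples, not sequences from this disclosure.

```python
from itertools import product

# Standard genetic code, built in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, reading frame 1, one letter per codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Two hypothetical sequences that differ at every third position.
variant_1 = "GCTAAAGAT"
variant_2 = "GCCAAGGAC"
print(translate(variant_1), translate(variant_2))
```

Both variants encode alanine, lysine and aspartic acid, which is the sense in which differing nucleotide sequences can specify one amino acid sequence.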

"Synthetic genes" can be assembled from oligonucleotide building blocks that are 
chemically synthesized using procedures known to those skilled in the art. These building 
blocks are ligated and annealed to form gene segments which are then enzymatically
assembled to construct the entire gene. "Chemically synthesized", as related to a sequence
of DNA, means that the component nucleotides were assembled in vitro. Manual chemical
synthesis of DNA may be accomplished using well established procedures, or automated
chemical synthesis can be performed using one of a number of commercially available
machines. Accordingly, the genes can be tailored for optimal gene expression based on
optimization of nucleotide sequence to reflect the codon bias of the host cell. The skilled
artisan appreciates the likelihood of successful gene expression if codon usage is biased
towards those codons favored by the host. Determination of preferred codons can be based
on a survey of genes derived from the host cell where sequence information is available.
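
A survey of host genes of the kind described above can be sketched in Python as a simple codon count. This is a minimal illustration under stated assumptions: the names `codon_usage`, `preferred_codon` and `ALANINE_CODONS` and the two short coding sequences in `survey` are hypothetical, and a real survey would use many full-length coding sequences from the host.

```python
from collections import Counter
from typing import Iterable, Sequence


def codon_usage(coding_sequences: Iterable[str]) -> Counter:
    """Count how often each codon occurs across in-frame coding sequences."""
    counts: Counter = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    return counts


def preferred_codon(counts: Counter, synonymous_codons: Sequence[str]) -> str:
    """Return the most frequently observed codon among a set of synonymous codons."""
    return max(synonymous_codons, key=lambda codon: counts[codon])


# The four alanine codons of the standard genetic code.
ALANINE_CODONS = ("GCT", "GCC", "GCA", "GCG")

# Hypothetical host coding sequences used only to exercise the functions.
survey = ["ATGGCCGCCGCTTAA", "ATGGCCGCATGA"]
usage = codon_usage(survey)
best_alanine = preferred_codon(usage, ALANINE_CODONS)
print(best_alanine, usage["GCC"])
```

A synthetic gene could then be designed by choosing, for each amino acid, the codon this survey favors in the intended host.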

"Gene" refers to a nucleic acid fragment that expresses a specific protein, including
regulatory sequences preceding (5' non-coding sequences) and following (3' non-coding
sequences) the coding sequence. "Native gene" refers to a gene as found in nature with its
own regulatory sequences. "Chimeric gene" refers to any gene that is not a native gene,
comprising regulatory and coding sequences that are not found together in nature.
Accordingly, a chimeric gene may comprise regulatory sequences and coding sequences that
are derived from different sources, or regulatory sequences and coding sequences derived
from the same source, but arranged in a manner different than that found in nature.
"Endogenous gene" refers to a native gene in its natural location in the genome of an
organism. A "foreign" gene refers to a gene not normally found in the host organism, but
that is introduced into the host organism by gene transfer. Foreign genes can comprise
native genes inserted into a non-native organism, or chimeric genes. A "transgene" is a gene
that has been introduced into the genome by a transformation procedure.

"Coding sequence" refers to a DNA sequence that codes for a specific amino acid 
sequence. "Regulatory sequences" refer to nucleotide sequences located upstream (5'
non-coding sequences), within, or downstream (3' non-coding sequences) of a coding sequence,
and which influence the transcription, RNA processing or stability, or translation of the
associated coding sequence. Regulatory sequences may include promoters, translation
leader sequences, introns, and polyadenylation recognition sequences.

"Promoter" refers to a DNA sequence capable of controlling the expression of a 
coding sequence or functional RNA. In general, a coding sequence is located 3' to a
promoter sequence. The promoter sequence consists of proximal and more distal upstream
elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a
DNA sequence which can stimulate promoter activity and may be an innate element of the
promoter or a heterologous element inserted to enhance the level or tissue-specificity of a
promoter. Promoters may be derived in their entirety from a native gene, or be composed of
different elements derived from different promoters found in nature, or even comprise
synthetic DNA segments. It is understood by those skilled in the art that different promoters
may direct the expression of a gene in different tissues or cell types, or at different stages of
development, or in response to different environmental conditions. Promoters which cause a
gene to be expressed in most cell types at most times are commonly referred to as
"constitutive promoters". New promoters of various types useful in plant cells are
constantly being discovered; numerous examples may be found in the compilation by
Okamuro and Goldberg, (1989) Biochemistry of Plants 15:1-82. It is further recognized that
since in most cases the exact boundaries of regulatory sequences have not been completely
defined, DNA fragments of different lengths may have identical promoter activity.

The "translation leader sequence" refers to a DNA sequence located between the
promoter sequence of a gene and the coding sequence. The translation leader sequence is
present in the fully processed mRNA upstream of the translation start sequence. The
translation leader sequence may affect processing of the primary transcript to mRNA,
mRNA stability or translation efficiency. Examples of translation leader sequences have
been described (Turner, R. and Foster, G. D. (1995) Molecular Biotechnology 3:225).

The "3' non-coding sequences" refer to DNA sequences located downstream of a
coding sequence and include polyadenylation recognition sequences and other sequences
encoding regulatory signals capable of affecting mRNA processing or gene expression. The
polyadenylation signal is usually characterized by affecting the addition of polyadenylic acid
tracts to the 3' end of the mRNA precursor. The use of different 3' non-coding sequences is
exemplified by Ingelbrecht et al., (1989) Plant Cell 1:671-680.

"RNA transcript" refers to the product resulting from RNA polymerase-catalyzed 
transcription of a DNA sequence. When the RNA transcript is a perfect complementary
copy of the DNA sequence, it is referred to as the primary transcript or it may be an RNA
sequence derived from posttranscriptional processing of the primary transcript and is
referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the RNA that is
without introns and that can be translated into protein by the cell. "cDNA" refers to a
double-stranded DNA that is complementary to and derived from mRNA. "Sense" RNA
refers to an RNA transcript that includes the mRNA and so can be translated into protein by the
cell. "Antisense RNA" refers to an RNA transcript that is complementary to all or part of a
target primary transcript or mRNA and that blocks the expression of a target gene (U.S. Pat.
No. 5,107,065, incorporated herein by reference). The complementarity of an antisense
RNA may be with any part of the specific gene transcript, i.e., at the 5' non-coding sequence,
3' non-coding sequence, introns, or the coding sequence. "Functional RNA" refers to sense
RNA, antisense RNA, ribozyme RNA, or other RNA that may not be translated but yet has
an effect on cellular processes.
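
The complementarity between sense and antisense RNA can be made concrete with a small Python sketch that computes the reverse complement of an RNA sequence. The function name `antisense_rna` and the nine-base example are hypothetical; the sketch assumes a plain A/C/G/U alphabet with no modified or ambiguous bases.

```python
# Watson-Crick pairing for RNA: A pairs with U, C pairs with G.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(sense_rna: str) -> str:
    """Return the antisense strand, written 5' to 3', for a sense RNA sequence."""
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return sense_rna.upper().translate(RNA_COMPLEMENT)[::-1]


sense = "AUGGCUUAA"  # hypothetical sense fragment, 5' to 3'
antisense = antisense_rna(sense)
print(antisense)
```

Applying the function twice returns the original sequence, which is a quick sanity check that the two strands are mutually complementary.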

The term "operably linked" refers to the association of nucleic acid sequences on a
single nucleic acid fragment so that the function of one is affected by the other. For
example, a promoter is operably linked with a coding sequence when it is capable of
affecting the expression of that coding sequence (i.e., that the coding sequence is under the
transcriptional control of the promoter). Coding sequences can be operably linked to
regulatory sequences in sense or antisense orientation.

The term "expression", as used herein, refers to the transcription and stable
accumulation of sense (mRNA) or antisense RNA derived from the nucleic acid fragment of
the invention. Expression may also refer to translation of mRNA into a polypeptide.
"Antisense inhibition" refers to the production of antisense RNA transcripts capable of
suppressing the expression of the target protein. "Overexpression" refers to the production
of a gene product in transgenic organisms that exceeds levels of production in normal or
non-transformed organisms. "Co-suppression" refers to the production of sense RNA
transcripts capable of suppressing the expression of identical or substantially similar foreign
or endogenous genes (U.S. Patent No. 5,231,020, incorporated herein by reference).

"Altered levels" refers to the production of gene product(s) in transgenic organisms in 
amounts or proportions that differ from that of normal or non-transformed organisms. 

"Mature" protein refers to a post-translationally processed polypeptide; i.e., one from 
which any pre- or propeptides present in the primary translation product have been removed. 
"Precursor" protein refers to the primary product of translation of mRNA; i.e., with pre- and
propeptides still present. Pre- and propeptides may be but are not limited to intracellular
localization signals.

A "chloroplast transit peptide" is an amino acid sequence which is translated in
conjunction with a protein and directs the protein to the chloroplast or other plastid types
present in the cell in which the protein is made. "Chloroplast transit sequence" refers to a
nucleotide sequence that encodes a chloroplast transit peptide. A "signal peptide" is an
amino acid sequence which is translated in conjunction with a protein and directs the protein
to the secretory system (Chrispeels, J. J., (1991) Ann. Rev. Plant Phys. Plant Mol. Biol.
42:21-53). If the protein is to be directed to a vacuole, a vacuolar targeting signal (supra)
can further be added, or if to the endoplasmic reticulum, an endoplasmic reticulum retention
signal (supra) may be added. If the protein is to be directed to the nucleus, any signal
peptide present should be removed and instead a nuclear localization signal included
(Raikhel (1992) Plant Phys. 100:1627-1632).

"Transformation" refers to the transfer of a nucleic acid fragment into the genome of a 
host organism, resulting in genetically stable inheritance. Host organisms containing the
transformed nucleic acid fragments are referred to as "transgenic" organisms. Examples of
methods of plant transformation include Agrobacterium-mediated transformation (De Blaere
et al. (1987) Meth. Enzymol. 143:277) and particle-accelerated or "gene gun" transformation
technology (Klein et al. (1987) Nature (London) 327:70-73; U.S. Patent No. 4,945,050,
incorporated herein by reference).

Standard recombinant DNA and molecular cloning techniques used herein are well
known in the art and are described more fully in Sambrook, J., Fritsch, E. F. and Maniatis, T.
Molecular Cloning: A Laboratory Manual; Cold Spring Harbor Laboratory Press: Cold
Spring Harbor, 1989 (hereinafter "Maniatis").

Nucleic acid fragments encoding at least a portion of several cell cycle regulatory
proteins have been isolated and identified by comparison of random plant cDNA sequences
to public databases containing nucleotide and protein sequences using the BLAST
algorithms well known to those skilled in the art. Table 1 lists the proteins that are described
herein, and the designation of the cDNA clones that comprise the nucleic acid fragments
encoding these proteins.

TABLE 1

Cell Cycle Regulatory Proteins

Enzyme    Clone              Plant
CDC-16    cpflc.pk001.kl3    Corn
          sfll.pk0030.e3     Soybean
DP-1      ids.pk0025.f7      Impatiens
          p0072.comfs64r     Corn
          src3c.pk018.m11    Soybean
DP-2      p0005.cbmfh22r     Corn
          cdelc.pk001.jl3    Corn
          cen3n.pk0183.b9    Corn
          wlmkl.pk0005.e2    Wheat
E2F       rslln.pk004.dl5    Rice
          sel.pk0012.f4      Soybean



The nucleic acid fragments of the instant invention may be used to isolate cDNAs and
genes encoding homologous proteins from the same or other plant species. Isolation of
homologous genes using sequence-dependent protocols is well known in the art. Examples
of sequence-dependent protocols include, but are not limited to, methods of nucleic acid
hybridization, and methods of DNA and RNA amplification as exemplified by various uses
of nucleic acid amplification technologies (e.g., polymerase chain reaction, ligase chain
reaction).

For example, genes encoding other CDC-16, DP-1, DP-2 or E2F proteins, either as
cDNAs or genomic DNAs, could be isolated directly by using all or a portion of the instant
nucleic acid fragments as DNA hybridization probes to screen libraries from any desired
plant employing methodology well known to those skilled in the art. Specific
oligonucleotide probes based upon the instant nucleic acid sequences can be designed and
synthesized by methods known in the art (Maniatis). Moreover, the entire sequences can be
used directly to synthesize DNA probes by methods known to the skilled artisan such as
random primer DNA labeling, nick translation, or end-labeling techniques, or RNA probes
using available in vitro transcription systems. In addition, specific primers can be designed
and used to amplify a part or all of the instant sequences. The resulting amplification
products can be labeled directly during amplification reactions or labeled after amplification
reactions, and used as probes to isolate full length cDNA or genomic fragments under
conditions of appropriate stringency.
In addition, two short segments of the instant nucleic acid fragments may be used in
polymerase chain reaction protocols to amplify longer nucleic acid fragments encoding
homologous genes from DNA or RNA. The polymerase chain reaction may also be
performed on a library of cloned nucleic acid fragments wherein the sequence of one primer
is derived from the instant nucleic acid fragments, and the sequence of the other primer takes
advantage of the presence of the polyadenylic acid tracts to the 3' end of the mRNA
precursor encoding plant genes. Alternatively, the second primer sequence may be based
upon sequences derived from the cloning vector. For example, the skilled artisan can follow
the RACE protocol (Frohman et al., (1988) PNAS USA 85:8998) to generate cDNAs by
using PCR to amplify copies of the region between a single point in the transcript and the 3'
or 5' end. Primers oriented in the 3' and 5' directions can be designed from the instant
sequences. Using commercially available 3' RACE or 5' RACE systems (BRL), specific 3'
or 5' cDNA fragments can be isolated (Ohara et al., (1989) PNAS USA 86:5673; Loh et al.,
(1989) Science 243:217). Products generated by the 3' and 5' RACE procedures can be
combined to generate full-length cDNAs (Frohman, M. A. and Martin, G. R., (1989)
Techniques 1:165).
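
By way of illustration only, the choice of two short terminal segments of a fragment as an amplification primer pair may be sketched in Python as follows. The fragment shown is hypothetical, the function names are not part of this disclosure, and the Wallace rule is used merely as a rough estimate of melting temperature; practical primer design considers further factors.

```python
from typing import Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def end_primers(template: str, length: int = 20) -> Tuple[str, str]:
    """Forward primer = first `length` bases of the template; reverse primer =
    reverse complement of the last `length` bases, so both read 5' to 3'."""
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping primers")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


# Hypothetical 48-base fragment, invented for illustration only.
fragment = "ATGGCTTCAGGAGCTAAGCTTGACCGTACGTTAGGCATCCGATTGCAA"
fwd, rev = end_primers(fragment, length=18)
print(fwd, wallace_tm(fwd))
print(rev, wallace_tm(rev))
```

For a library-based amplification as described above, the reverse primer would instead be derived from the polyadenylic acid tract or from the cloning vector, and only the forward primer would be taken from the instant sequence.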

Availability of the instant nucleotide and deduced amino acid sequences facilitates 
immunological screening of cDNA expression libraries. Synthetic peptides representing 
portions of the instant amino acid sequences may be synthesized. These peptides can be 
used to immunize animals to produce polyclonal or monoclonal antibodies with specificity 
for peptides or proteins comprising the amino acid sequences. These antibodies can then
be used to screen cDNA expression libraries to isolate full-length cDNA clones of interest
(Lerner, R. A. (1984) Adv. Immunol. 36:1; Maniatis).

The nucleic acid fragments of the instant invention may be used to create transgenic 
plants in which the disclosed CDC-16, DP-1, DP-2 or E2F proteins are present at higher or
lower levels than normal or in cell types or developmental stages in which they are not
normally found. This would have the effect of altering cell cycle regulation in those cells.

Overexpression of the CDC-16, DP-1, DP-2 or E2F proteins of the instant invention
may be accomplished by first constructing a chimeric gene in which the coding region is 
operably linked to a promoter capable of directing expression of a gene in the desired tissues
at the desired stage of development. For reasons of convenience, the chimeric gene may 
comprise promoter sequences and translation leader sequences derived from the same genes. 
3' non-coding sequences encoding transcription termination signals may also be provided.
The instant chimeric gene may also comprise one or more introns in order to facilitate gene
expression.

Plasmid vectors comprising the instant chimeric gene can then be constructed. The
choice of plasmid vector is dependent upon the method that will be used to transform host
plants. The skilled artisan is well aware of the genetic elements that must be present on the
plasmid vector in order to successfully transform, select and propagate host cells containing
the chimeric gene. The skilled artisan will also recognize that different independent
transformation events will result in different levels and patterns of expression (Jones et al.,
(1985) EMBO J. 4:2411-2418; De Almeida et al., (1989) Mol. Gen. Genetics 218:78-86),
and thus that multiple events must be screened in order to obtain lines displaying the desired
expression level and pattern. Such screening may be accomplished by Southern analysis of
DNA, Northern analysis of mRNA expression, Western analysis of protein expression, or
phenotypic analysis.

It may also be desirable to reduce or eliminate expression of genes encoding CDC-16,
DP-1, DP-2 or E2F proteins in plants for some applications. In order to accomplish this, a
chimeric gene designed for co-suppression of the instant cell cycle regulatory proteins can be
constructed by linking a gene or gene fragment encoding a CDC-16, DP-1, DP-2 or E2F
protein to plant promoter sequences. Alternatively, a chimeric gene designed to express
antisense RNA for all or part of the instant nucleic acid fragment can be constructed by
linking the gene or gene fragment in reverse orientation to plant promoter sequences. Either
the co-suppression or antisense chimeric genes could be introduced into plants via
transformation wherein expression of the corresponding endogenous genes is reduced or
eliminated.

The instant CDC-16, DP-1, DP-2 or E2F proteins (or portions thereof) may be
produced in heterologous host cells, particularly in the cells of microbial hosts, and can be
used to prepare antibodies to these proteins by methods well known to those skilled in
the art. The antibodies are useful for detecting CDC-16, DP-1, DP-2 or E2F proteins in situ
in cells or in vitro in cell extracts. Preferred heterologous host cells for production of the
instant CDC-16, DP-1, DP-2 or E2F proteins are microbial hosts. Microbial expression
systems and expression vectors containing regulatory sequences that direct high level
expression of foreign proteins are well known to those skilled in the art. Any of these could
be used to construct a chimeric gene for production of the instant CDC-16, DP-1, DP-2 or
E2F proteins. This chimeric gene could then be introduced into appropriate microorganisms
via transformation to provide high level expression of the encoded cell cycle regulatory
protein. An example of a vector for high level expression of the instant CDC-16, DP-1,
DP-2 or E2F proteins in a bacterial host is provided (Example 9).

Additionally, the instant CDC-16, DP-1, DP-2 or E2F proteins can be used as targets
to facilitate design and/or identification of inhibitors of those proteins that may be useful as
herbicides. This is desirable because the CDC-16, DP-1, DP-2 or E2F proteins described
herein are involved in the regulation of cell cycle. Accordingly, inhibition of the activity of
one or more of the proteins described herein could lead to inhibition of plant growth. Thus, the
instant CDC-16, DP-1, DP-2 or E2F proteins could be appropriate for new herbicide
discovery and design.

All or a substantial portion of the nucleic acid fragments of the instant invention may
also be used as probes for genetically and physically mapping the genes that they are a part
of, and as markers for traits linked to those genes. Such information may be useful in plant
breeding in order to develop lines with desired phenotypes. For example, the instant nucleic
acid fragments may be used as restriction fragment length polymorphism (RFLP) markers.
Southern blots (Maniatis) of restriction-digested plant genomic DNA may be probed with
the nucleic acid fragments of the instant invention. The resulting banding patterns may then
be subjected to genetic analyses using computer programs such as MapMaker (Lander et al.,
(1987) Genomics 1:174-181) in order to construct a genetic map. In addition, the nucleic
acid fragments of the instant invention may be used to probe Southern blots containing
restriction endonuclease-treated genomic DNAs of a set of individuals representing parent
and progeny of a defined genetic cross. Segregation of the DNA polymorphisms is noted
and used to calculate the position of the instant nucleic acid sequence in the genetic map
previously obtained using this population (Botstein, D. et al., (1980) Am. J. Hum. Genet.
32:314-331).
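
As a simplified illustration of how segregation data may be converted into a map position, the following Python sketch computes a two-point recombination fraction and the corresponding Haldane map distance. The progeny counts are hypothetical, and the two-point calculation is a stand-in chosen for brevity; it is not a description of the multipoint analysis performed by programs such as MapMaker.

```python
import math


def recombination_fraction(parental: int, recombinant: int) -> float:
    """Fraction of scored progeny that are recombinant between two loci."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("no progeny scored")
    return recombinant / total


def haldane_cm(r: float) -> float:
    """Haldane map distance in centimorgans for recombination fraction r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Hypothetical backcross counts, invented for illustration only.
r = recombination_fraction(parental=90, recombinant=10)
print(f"r = {r:.3f}, distance = {haldane_cm(r):.1f} cM")
```

The Haldane function assumes no crossover interference; other mapping functions could be substituted without changing the scoring step.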

The production and use of plant gene-derived probes for use in genetic mapping is
described in Bernatzky, R. and Tanksley, S. D. (1986) Plant Mol. Biol. Reporter
4(1):37-41. Numerous publications describe genetic mapping of specific cDNA clones
using the methodology outlined above or variations thereof. For example, F2 intercross 
populations, backcross populations, randomly mated populations, near isogenic lines, and 
other sets of individuals may be used for mapping. Such methodologies are well known to 
those skilled in the art. 

Nucleic acid probes derived from the instant nucleic acid sequences may also be used
for physical mapping (i.e., placement of sequences on physical maps; see Hoheisel, J. D., et
al., In: Nonmammalian Genomic Analysis: A Practical Guide, Academic Press 1996,
pp. 319-346, and references cited therein).

In another embodiment, nucleic acid probes derived from the instant nucleic acid
sequences may be used in direct fluorescence in situ hybridization (FISH) mapping (Trask,
B. J. (1991) Trends Genet. 7:149-154). Although current methods of FISH mapping favor
use of large clones (several to several hundred kb; see Laan, M. et al. (1995) Genome
Research 5:13-20), improvements in sensitivity may allow performance of FISH mapping
using shorter probes.

A variety of nucleic acid amplification-based methods of genetic and physical
mapping may be carried out using the instant nucleic acid sequences. Examples include
allele-specific amplification (Kazazian, H. H. (1989) J. Lab. Clin. Med. 114(2):95-96),
polymorphism of PCR-amplified fragments (CAPS; Sheffield, V. C. et al. (1993) Genomics
16:325-332), allele-specific ligation (Landegren, U. et al. (1988) Science 241:1077-1080),
nucleotide extension reactions (Sokolov, B. P. (1990) Nucleic Acid Res. 18:3671), Radiation
Hybrid Mapping (Walter, M. A. et al. (1997) Nature Genetics 7:22-28) and Happy Mapping
(Dear, P. H. and Cook, P. R. (1989) Nucleic Acid Res. 17:6795-6807). For these methods,
the sequence of a nucleic acid fragment is used to design and produce primer pairs for use in
the amplification reaction or in primer extension reactions. The design of such primers is
well known to those skilled in the art. In methods employing PCR-based genetic mapping,
it may be necessary to identify DNA sequence differences between the parents of the
mapping cross in the region corresponding to the instant nucleic acid sequence. This,
however, is generally not necessary for mapping methods.

Loss of function mutant phenotypes may be identified for the instant cDNA clones
either by targeted gene disruption protocols or by identifying specific mutants for these
genes contained in a maize population carrying mutations in all possible genes (Ballinger
and Benzer, (1989) Proc. Natl. Acad. Sci. USA 86:9402; Koes et al., (1995) Proc. Natl. Acad.
Sci. USA 92:8149; Bensen et al., (1995) Plant Cell 7:75). The latter approach may be
accomplished in two ways. First, short segments of the instant nucleic acid fragments may
be used in polymerase chain reaction protocols in conjunction with a mutation tag sequence
primer on DNAs prepared from a population of plants in which Mutator transposons or some
other mutation-causing DNA element has been introduced (see Bensen, supra). The
amplification of a specific DNA fragment with these primers indicates the insertion of the
mutation tag element in or near the plant gene encoding the CDC-16, DP-1, DP-2 or E2F
protein. Alternatively, the instant nucleic acid fragment may be used as a hybridization
probe against PCR amplification products generated from the mutation population using the
mutation tag sequence primer in conjunction with an arbitrary genomic site primer, such as
that for a restriction enzyme site-anchored synthetic adaptor. With either method, a plant
containing a mutation in the endogenous gene encoding a CDC-16, DP-1, DP-2 or E2F
protein can be identified and obtained. This mutant plant can then be used to determine or
confirm the natural function of the CDC-16, DP-1, DP-2 or E2F protein gene product.

EXAMPLES 

The present invention is further defined in the following Examples, in which all parts 
and percentages are by weight and degrees are Celsius, unless otherwise stated. It should be
understood that these Examples, while indicating preferred embodiments of the invention,
are given by way of illustration only. From the above discussion and these Examples, one
skilled in the art can ascertain the essential characteristics of this invention, and without
departing from the spirit and scope thereof, can make various changes and modifications of
the invention to adapt it to various usages and conditions.

EXAMPLE 1 

Composition of cDNA Libraries; Isolation and Sequencing of cDNA Clones
cDNA libraries representing mRNAs from various corn, Impatiens, rice, soybean and
wheat tissues were prepared. The characteristics of the libraries are described below.

TABLE 2 

cDNA Libraries from Corn, Impatiens, Rice, Soybean and Wheat

Library  Tissue                                                   Clone

cdelc    Corn developing embryo 20 days after pollination         cdelc.pk001.jl3
cen3n    Corn endosperm days after pollination*                   cen3n.pk0183.b9
cpf1c    Corn tissue treated with chemicals related to protein    cpflc.pk001.kl3
         synthesis and pooled***
ids      Impatiens balsamina developing seed                      ids.pk0025.f7
p0005    Corn immature ear                                        p0005.cbmfh22r
p0072    Corn 14 days after planting etiolated seedling:          p0072.comfs64r
         Mesocotyl
rsl1n    Rice 15 day old seedling*                                rslln.pk004.dl5
se1      Soybean embryo, 6-10 day after flowering                 sel.pk0012.f4
sfll     Soybean immature flower                                  sfll.pk0030.e3
src3c    Soybean 8 day old root inoculated with eggs of cyst      src3c.pk018.mll
         nematode Heterodera glycines (Race 14) for 4 days
wlmkl    Wheat seedlings 1 hr after inoculation with Erysiphe     wlmkl.pk0005.e2
         graminis f. sp tritici and treatment with fungicide**

*These libraries were normalized essentially as described in U.S. Patent No. 5,482,845.
**Application of 6-iodo-2-propoxy-3-propyl-4(3H)-quinazolinone; synthesis and methods
of using this compound are described in USSN 08/545,827, incorporated herein by
reference.
***Chloramphenicol, cycloheximide, neomycin sulfate and aurintricarboxylic acid,
commercially available from Sigma Chemical Co. and Calbiochem-Novabiochem Corp.

cDNA libraries were prepared in Uni-ZAP™ XR vectors according to the 
manufacturer's protocol (Stratagene Cloning Systems, La Jolla, CA). Conversion of the 
Uni-ZAP™ XR libraries into plasmid libraries was accomplished according to the protocol 
provided by Stratagene. Upon conversion, cDNA inserts were contained in the plasmid 
vector pBluescript. cDNA inserts from randomly picked bacterial colonies containing 
recombinant pBluescript plasmids were amplified via polymerase chain reaction using 
primers specific for vector sequences flanking the inserted cDNA sequences or plasmid 
DNA was prepared from cultured bacterial cells. Amplified insert DNAs or plasmid DNAs
were sequenced in dye-primer sequencing reactions to generate partial cDNA sequences
(expressed sequence tags or "ESTs"; see Adams, M. D. et al., (1991) Science 252:1651).
The resulting ESTs were analyzed using a Perkin Elmer Model 377 fluorescent sequencer.

EXAMPLE 2 
Identification of cDNA Clones 
ESTs encoding cell cycle regulatory proteins were identified by conducting BLAST
(Basic Local Alignment Search Tool; Altschul, S. F., et al., (1993) J. Mol. Biol.
215:403-410; see also www.ncbi.nlm.nih.gov/BLAST/) searches for similarity to
sequences contained in the BLAST "nr" database (comprising all non-redundant GenBank
CDS translations, sequences derived from the 3-dimensional structure Brookhaven Protein
Data Bank, the last major release of the SWISS-PROT protein sequence database, EMBL,
and DDBJ databases). The cDNA sequences obtained in Example 1 were analyzed for
similarity to all publicly available DNA sequences contained in the "nr" database using the
BLASTN algorithm provided by the National Center for Biotechnology Information
(NCBI). The DNA sequences were translated in all reading frames and compared for
similarity to all publicly available protein sequences contained in the "nr" database using
the BLASTX algorithm (Gish, W. and States, D. J. (1993) Nature Genetics 3:266-272 and
Altschul, Stephen F., et al. (1997) Nucleic Acids Res. 25:3389-3402) provided by the NCBI.
For convenience, the P-value (probability) of observing a match of a cDNA sequence to a
sequence contained in the searched databases merely by chance as calculated by BLAST
is reported herein as a "pLog" value, which represents the negative of the logarithm of the
reported P-value. Accordingly, the greater the pLog value, the greater the likelihood that
the cDNA sequence and the BLAST "hit" represent homologous proteins.
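
The relationship between a P-value and its pLog value may be illustrated by the short Python sketch below. The text above does not state the base of the logarithm; base 10 is assumed here, and the function names are hypothetical.

```python
import math


def plog(p_value: float) -> float:
    """Negative logarithm of a BLAST P-value (base 10 is an assumption)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("P-value must be in (0, 1]")
    return -math.log10(p_value)


def p_from_plog(plog_value: float) -> float:
    """Inverse of plog(): recover the P-value from a pLog score."""
    return 10.0 ** (-plog_value)


print(plog(1e-11))         # pLog for a chance-match probability of 1e-11
print(p_from_plog(50.40))  # P-value corresponding to a pLog score of 50.40
```

Under this assumption each unit increase in pLog corresponds to a tenfold decrease in the probability that the match arose by chance.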

EXAMPLE 3 

Characterization of cDNA Clones Encoding CDC-16 Proteins
The BLASTX search using the EST sequences from clones cpflc.pk001.kl3 and
sfll.pk0030.e3 revealed similarity of the proteins encoded by the cDNAs to CDC-16 protein
from Homo sapiens (NCBI Identifier No. gi 1362769). The BLAST results for each of these
ESTs are shown in Table 3:

TABLE 3

BLAST Results for Clones Encoding Polypeptides Homologous
to Homo sapiens CDC-16 Protein

Clone               BLAST pLog Score

cpflc.pk001.kl3     11.00
sfll.pk0030.e3      50.40



The sequence of a portion of the cDNA insert from clone cpflc.pk001.kl3 is shown in
SEQ ID NO:1; the deduced amino acid sequence of this cDNA, which represents 14% of the
N-terminal region of the protein, is shown in SEQ ID NO:2. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:2 and the Homo sapiens
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:2 is
40% similar to the Homo sapiens CDC-16 protein.

The sequence of a portion of the cDNA insert from clone sfll.pk0030.e3 is shown in
SEQ ID NO:3; the deduced amino acid sequence of this cDNA, which represents 43% of the
protein (middle region), is shown in SEQ ID NO:4. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:4 and the Homo sapiens sequence (using
the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:4 is 46% similar to
the Homo sapiens CDC-16 protein. The percent similarity between the corn and soybean
amino acid sequences was calculated to be 18% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode
portions of CDC-16 proteins. These sequences represent the first plant sequences encoding
CDC-16 proteins.

EXAMPLE 4 

Characterization of cDNA Clones Encoding DP-1 Protein

The BLASTX search using the EST sequence from clone ids.pk0025.f7 revealed
similarity of the protein encoded by the cDNA to DP-1 protein from Xenopus laevis (NCBI
Identifier No. gi 913227). The BLASTX search using the EST sequences from clones
p0072.comfs64r and src3c.pk018.mll revealed similarity of the proteins encoded by the
cDNAs to DP-1 protein from Mus musculus (NCBI Identifier No. gi 420232).
The BLAST results for each of these ESTs are shown in Table 4:

TABLE 4

BLAST Results for Clones Encoding Polypeptides Homologous
to Xenopus laevis and Mus musculus DP-1 Protein

Clone               BLAST pLog Score

ids.pk0025.f7       23J0
p0072.comfs64r      5.00
src3c.pk018.mll     17.00

The sequence of a portion of the cDNA insert from clone ids.pk0025.f7 is shown in
SEQ ID NO:5; the deduced amino acid sequence of this cDNA, which represents 33% of the
protein (middle region), is shown in SEQ ID NO:6. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:6 and the Xenopus laevis
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:6 is
46% similar to the Xenopus laevis DP-1 protein.

The sequence of a portion of the cDNA insert from clone p0072.comfs64r is shown in
SEQ ID NO:7; the deduced amino acid sequence of this cDNA, which represents 23% of the
protein (middle region), is shown in SEQ ID NO:8. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:8 and the Mus musculus sequence (using
the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:8 is 37% similar to
the Mus musculus DP-1 protein.

The sequence of a portion of the cDNA insert from clone src3c.pk018.mll is shown in
SEQ ID NO:9; the deduced amino acid sequence of this cDNA, which represents 42% of the
protein (middle region), is shown in SEQ ID NO:10. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:10 and the Mus musculus sequence
(using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:10 is 48%
similar to the Mus musculus DP-1 protein.

The percent similarity between the corn, Impatiens and soybean amino acid sequences
was calculated to range from 31 to 78% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode
portions of DP-1 proteins. These sequences represent the first plant sequences encoding
DP-1 proteins.

EXAMPLE 5

Characterization of cDNA Clones Encoding DP-2 Protein
The BLASTX search using the EST sequences from clones p0005.cbmfh22r,
cdelc.pk001.jl3 and cen3n.pk0183.b9 revealed similarity of the proteins encoded by the
cDNAs to DP-2 protein from Homo sapiens (NCBI Identifier No. gi 604479). The
BLASTX search using the EST sequence from clone wlmkl.pk0005.e2 revealed similarity
of the protein encoded by the cDNA to DP-2 protein from Mus species (NCBI Identifier
No. gi 3122929).

In the process of comparing the ESTs it was found that corn clones p0005.cbmfh22r,
cdelc.pk001.jl3 and cen3n.pk0183.b9 had overlapping regions of homology. Using this
homology it was possible to align the ESTs and assemble a contig encoding a unique corn
DP-2 protein.
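
The principle of joining overlapping ESTs into a contig may be sketched in Python as follows. The reads are hypothetical, and the exact suffix-prefix match used here is a deliberate simplification; it is not a description of the procedure actually used to assemble the corn contig, and practical assembly tolerates sequencing errors and considers both strands.

```python
from typing import Optional


def merge_by_overlap(left: str, right: str, min_overlap: int = 10) -> Optional[str]:
    """Join two reads when a suffix of `left` exactly matches a prefix of
    `right`; return None if no overlap of at least `min_overlap` bases exists.
    The longest qualifying overlap is used."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


# Hypothetical reads, invented for illustration only.
read_a = "GATTACAGGCTTAACCGGTT"
read_b = "TTAACCGGTTCCATGCAAGT"
print(merge_by_overlap(read_a, read_b, min_overlap=8))
```

Three or more reads can be combined by applying the same step repeatedly to the growing contig.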

The BLAST results for the corn contig and the wheat EST are shown in Table 5: 

TABLE 5

BLAST Results for Clones Encoding Polypeptides Homologous
to Homo sapiens and Mus species DP-2 Protein

Clone                   BLAST pLog Score

Contig composed of:     45.10
  p0005.cbmfh22r
  cdelc.pk001.jl3
  cen3n.pk0183.b9
wlmkl.pk0005.e2         7.00



The sequence of the corn contig composed of clones p0005.cbmfh22r,
cdelc.pk001.jl3 and cen3n.pk0183.b9 is shown in SEQ ID NO:11; the deduced amino acid
sequence of this contig, which represents 50% of the protein (middle region), is shown in
SEQ ID NO:12. A calculation of the percent similarity of the amino acid sequence set forth
in SEQ ID NO:12 and the Homo sapiens sequence (using the Clustal algorithm) revealed
that the protein encoded by SEQ ID NO:12 is 52% similar to the Homo sapiens DP-2 protein.

The sequence of the entire cDNA insert from clone wlmkl.pk0005.e2 is shown in SEQ
ID NO:13; the deduced amino acid sequence of this cDNA, which represents 25% of
the protein (middle region), is shown in SEQ ID NO:14. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:14 and the Mus species
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:14
is 44% similar to the Mus species DP-2 protein.

The percent similarity between the corn and wheat amino acid sequences was calculated
to be 86% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode
portions of DP-2 proteins. These sequences represent the first plant sequences encoding
DP-2 proteins.

EXAMPLE 6 

Characterization of cDNA Clones Encoding E2F Proteins
The BLASTX search using the EST sequence from clone rslln.pk004.dl5 revealed
similarity of the protein encoded by the cDNA to E2F-4 protein from Homo sapiens (NCBI
Identifier No. gi 1061146). The BLASTX search using the EST sequence from clone
sel.pk0012.f4 revealed similarity of the protein encoded by the cDNA to E2F protein from
Drosophila melanogaster (NCBI Identifier No. gi 3551069).

The BLAST results for each of the ESTs are shown in Table 6: 

TABLE 6

BLAST Results for Clones Encoding Polypeptides Homologous
to Homo sapiens and Drosophila melanogaster E2F Proteins

Clone               BLAST pLog Score

rslln.pk004.dl5     5.52
sel.pk0012.f4       11.22

The sequence of a portion of the cDNA insert from clone rslln.pk004.dl5 is shown in
SEQ ID NO:15; the deduced amino acid sequence of this cDNA, which represents 12% of
the protein (middle region), is shown in SEQ ID NO:16. A calculation of the percent
similarity of the amino acid sequence set forth in SEQ ID NO:16 and the Homo sapiens
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:16
is 49% similar to the Homo sapiens E2F-4 protein.

The sequence of the entire cDNA insert from clone sel.pk0012.f4 is shown in SEQ ID
NO:17; the deduced amino acid sequence of this cDNA, which represents 10% of the
protein (middle region), is shown in SEQ ID NO:18. A calculation of the percent similarity
of the amino acid sequence set forth in SEQ ID NO:18 and the Drosophila melanogaster
sequence (using the Clustal algorithm) revealed that the protein encoded by SEQ ID NO:18
is 43% similar to the Drosophila melanogaster E2F protein.

The percent similarity between the rice and soybean amino acid sequences was
calculated to be 15% using the Clustal algorithm.

BLAST scores and probabilities indicate that the instant nucleic acid fragments encode
portions of E2F proteins. These sequences represent the first plant sequences encoding E2F
proteins.

EXAMPLE 7

Expression of Chimeric Genes in Monocot Cells
A chimeric gene comprising a cDNA encoding a cell cycle regulatory protein in sense
orientation with respect to the maize 27 kD zein promoter that is located 5' to the cDNA
fragment, and the 10 kD zein 3' end that is located 3' to the cDNA fragment, can be
constructed. The cDNA fragment of this gene may be generated by polymerase chain
reaction (PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites
(NcoI or SmaI) can be incorporated into the oligonucleotides to provide proper orientation of
the DNA fragment when inserted into the digested vector pML103 as described below.
Amplification is then performed in a standard PCR. The amplified DNA is then digested
with restriction enzymes NcoI and SmaI and fractionated on an agarose gel. The appropriate
band can be isolated from the gel and combined with a 4.9 kb NcoI-SmaI fragment of the
plasmid pML103. Plasmid pML103 has been deposited under the terms of the Budapest
Treaty at ATCC (American Type Culture Collection, 10801 University Blvd., Manassas, VA
20110-2209), and bears accession number ATCC 97366. The DNA segment from pML103
contains a 1.05 kb SalI-NcoI promoter fragment of the maize 27 kD zein gene and a 0.96 kb
SmaI-SalI fragment from the 3' end of the maize 10 kD zein gene in the vector pGem9Zf(+)
(Promega). Vector and insert DNA can be ligated at 15°C overnight, essentially as
described (Maniatis). The ligated DNA may then be used to transform E. coli XL1-Blue
(Epicurian Coli XL-1 Blue™; Stratagene). Bacterial transformants can be screened by
restriction enzyme digestion of plasmid DNA and limited nucleotide sequence analysis using
the dideoxy chain termination method (Sequenase™ DNA Sequencing Kit; U.S.
Biochemical). The resulting plasmid construct would comprise a chimeric gene encoding, in
the 5' to 3' direction, the maize 27 kD zein promoter, a cDNA fragment encoding a cell cycle
regulatory protein, and the 10 kD zein 3' region.
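
Because the amplified fragment is digested with NcoI and SmaI before ligation, it can be useful to confirm that these sites occur only at the primer-introduced ends and not within the cDNA. The Python sketch below locates the standard recognition sequences of NcoI (CCATGG) and SmaI (CCCGGG) in a sequence; the amplicon shown is hypothetical and the function name is not part of this disclosure.

```python
from typing import Dict, List

# Standard recognition sequences; both are palindromic, so scanning one
# strand is sufficient.
SITES = {"NcoI": "CCATGG", "SmaI": "CCCGGG"}


def find_sites(seq: str) -> Dict[str, List[int]]:
    """Map each enzyme name to the 0-based start positions of its
    recognition sequence in `seq`."""
    s = seq.upper()
    hits: Dict[str, List[int]] = {}
    for enzyme, site in SITES.items():
        positions = []
        start = s.find(site)
        while start != -1:
            positions.append(start)
            start = s.find(site, start + 1)
        hits[enzyme] = positions
    return hits


# Hypothetical amplicon with an NcoI site at the 5' end and a SmaI site
# at the 3' end, invented for illustration only.
amplicon = "CCATGGCTTCAGGAGCTAAGCTTGACCGTACGCCCGGG"
print(find_sites(amplicon))
```

A site reported at any position other than the two ends would indicate that digestion could cut within the insert, in which case different cloning sites might be chosen.
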
The chimeric gene described above can then be introduced into corn cells by the
following procedure. Immature corn embryos can be dissected from developing caryopses
derived from crosses of the inbred corn lines H99 and LH132. The embryos are isolated 10
to 11 days after pollination when they are 1.0 to 1.5 mm long. The embryos are then placed
with the axis-side facing down and in contact with agarose-solidified N6 medium (Chu
et al., (1975) Sci. Sin. Peking 18:659-668). The embryos are kept in the dark at 27°C.
Friable embryogenic callus consisting of undifferentiated masses of cells with somatic
proembryoids and embryoids borne on suspensor structures proliferates from the scutellum
of these immature embryos. The embryogenic callus isolated from the primary explant can
be cultured on N6 medium and sub-cultured on this medium every 2 to 3 weeks.

The plasmid, p35S/Ac (obtained from Dr. Peter Eckes, Hoechst AG, Frankfurt,
Germany) may be used in transformation experiments in order to provide for a selectable
marker. This plasmid contains the pat gene (see European Patent Publication 0 242 236)
which encodes phosphinothricin acetyl transferase (PAT). The enzyme PAT confers
resistance to herbicidal glutamine synthetase inhibitors such as phosphinothricin. The pat
gene in p35S/Ac is under the control of the 35S promoter from Cauliflower Mosaic Virus 
(Odell et al. (1985) Nature 313:810-812) and the 3' region of the nopaline synthase gene 
from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. 

The particle bombardment method (Klein et al., (1987) Nature 327:70-73) may be
used to transfer genes to the callus culture cells. According to this method, gold particles
(1 µm in diameter) are coated with DNA using the following technique. Ten µg of plasmid
DNAs are added to 50 µL of a suspension of gold particles (60 mg per mL). Calcium
chloride (50 µL of a 2.5 M solution) and spermidine free base (20 µL of a 1.0 M solution)
are added to the particles. The suspension is vortexed during the addition of these solutions.
After 10 minutes, the tubes are briefly centrifuged (5 sec at 15,000 rpm) and the supernatant
removed. The particles are resuspended in 200 µL of absolute ethanol, centrifuged again
and the supernatant removed. The ethanol rinse is performed again and the particles
resuspended in a final volume of 30 µL of ethanol. An aliquot (5 µL) of the DNA-coated
gold particles can be placed in the center of a Kapton™ flying disc (Bio-Rad Labs). The
particles are then accelerated into the corn tissue with a Biolistic™ PDS-1000/He (Bio-Rad
Instruments, Hercules CA), using a helium pressure of 1000 psi, a gap distance of 0.5 cm
and a flying distance of 1.0 cm.
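
The quantities delivered per bombardment follow from simple arithmetic, sketched below in Python. The calculation takes the amounts as 10 µg of DNA, 50 µL of gold suspension at 60 mg per mL, a final volume of 30 µL and a 5 µL aliquot per disc, and assumes, purely for illustration, that no particles are lost during the washes and that all of the DNA binds to the gold. The class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CoatingPrep:
    dna_ug: float = 10.0           # plasmid DNA added (micrograms)
    gold_ul: float = 50.0          # volume of gold suspension (microliters)
    gold_mg_per_ml: float = 60.0   # gold concentration in that suspension
    final_ul: float = 30.0         # final ethanol resuspension volume
    shot_ul: float = 5.0           # aliquot loaded per flying disc

    def gold_mg(self) -> float:
        """Total mass of gold particles in the preparation (mg)."""
        return self.gold_ul / 1000.0 * self.gold_mg_per_ml

    def shots(self) -> float:
        """Number of aliquots available from the final suspension."""
        return self.final_ul / self.shot_ul

    def per_shot(self) -> Tuple[float, float]:
        """(mg gold, ug DNA) per aliquot, assuming no loss during washes
        and complete binding of DNA to the particles."""
        fraction = self.shot_ul / self.final_ul
        return self.gold_mg() * fraction, self.dna_ug * fraction


prep = CoatingPrep()
gold, dna = prep.per_shot()
print(f"{prep.gold_mg():.1f} mg gold total, {prep.shots():.0f} shots")
print(f"per shot: {gold:.2f} mg gold, {dna:.2f} ug DNA")
```

Actual delivered amounts would be expected to be lower than these upper bounds, since some material is lost at each centrifugation step.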

For bombardment, the embryogenic tissue is placed on filter paper over agarose-solidified
N6 medium. The tissue is arranged as a thin lawn and covers a circular area of
about 5 cm in diameter. The petri dish containing the tissue can be placed in the chamber of
the PDS-1000/He approximately 8 cm from the stopping screen. The air in the chamber is 
then evacuated to a vacuum of 28 inches of Hg. The macrocarrier is accelerated with a 
helium shock wave using a rupture membrane that bursts when the He pressure in the shock 
tube reaches 1000 psi. 

Seven days after bombardment the tissue can be transferred to N6 medium that
contains glufosinate (2 mg per liter) and lacks casein or proline. The tissue continues to
grow slowly on this medium. After an additional 2 weeks the tissue can be transferred to
fresh N6 medium containing glufosinate. After 6 weeks, areas of about 1 cm in diameter
of actively growing callus can be identified on some of the plates containing the
glufosinate-supplemented medium. These calli may continue to grow when sub-cultured on the
selective medium.

Plants can be regenerated from the transgenic callus by first transferring clusters of
tissue to N6 medium supplemented with 0.2 mg per liter of 2,4-D. After two weeks the
tissue can be transferred to regeneration medium (Fromm et al., (1990) Bio/Technology
8:833-839).

EXAMPLE 8 
Expression of Chimeric Genes in Dicot Cells 
A seed-specific expression cassette composed of the promoter and transcription
terminator from the gene encoding the β subunit of the seed storage protein phaseolin from
the bean Phaseolus vulgaris (Doyle et al. (1986) J. Biol. Chem. 261:9228-9238) can be used
for expression of the instant cell cycle regulatory protein in transformed soybean. The
phaseolin cassette includes about 500 nucleotides upstream (5') from the translation initiation
codon and about 1650 nucleotides downstream (3') from the translation stop codon of
phaseolin. Between the 5' and 3' regions are the unique restriction endonuclease sites Nco I
(which includes the ATG translation initiation codon), Sma I, Kpn I and Xba I. The entire
cassette is flanked by Hind III sites.

The cDNA fragment of this gene may be generated by polymerase chain reaction
(PCR) of the cDNA clone using appropriate oligonucleotide primers. Cloning sites can be
incorporated into the oligonucleotides to provide proper orientation of the DNA fragment
when inserted into the expression vector. Amplification is then performed as described
above, and the isolated fragment is inserted into a pUC18 vector carrying the seed
expression cassette.
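The incorporation of cloning sites into the PCR primers can be illustrated with a short sketch. It is not the procedure actually used for the instant clones: the 33-nucleotide cDNA, the two-base GC clamp, the 15-nucleotide annealing length, and the choice of Kpn I and Xba I as the upstream and downstream sites are all assumptions made for illustration. Only the recognition sequences of the four cassette sites are standard.

```python
# Recognition sequences of the cloning sites in the phaseolin cassette.
SITES = {"NcoI": "CCATGG", "SmaI": "CCCGGG", "KpnI": "GGTACC", "XbaI": "TCTAGA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def internal_sites(seq):
    """List cassette sites that already occur inside the insert."""
    return [name for name, site in SITES.items() if site in seq]


def tailed_primer(core, site_name, clamp="GC"):
    """Prefix a gene-specific core with a 5' clamp and a restriction site."""
    return clamp + SITES[site_name] + core


# Hypothetical 33-nt open reading frame, used only to exercise the functions.
cdna = "ATGGCTGAGAAGCTGCTGGACCCAGGTTACTGA"

# Upstream primer carries Kpn I, downstream primer carries Xba I (assumed choice).
forward = tailed_primer(cdna[:15], "KpnI")
reverse = tailed_primer(reverse_complement(cdna[-15:]), "XbaI")
```

Because the two primers carry different sites, the digested product can ligate into the cassette in only one orientation. The `internal_sites` check matters in practice: a site that also occurs inside the insert would cut the fragment during digestion.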

Soybean embryos may then be transformed with the expression vector comprising a
sequence encoding a cell cycle regulatory protein. To induce somatic embryos, cotyledons,
3-5 mm in length dissected from surface sterilized, immature seeds of the soybean cultivar
A2872, can be cultured in the light or dark at 26°C on an appropriate agar medium for
6-10 weeks. Somatic embryos which produce secondary embryos are then excised and
placed into a suitable liquid medium. After repeated selection for clusters of somatic
embryos which multiplied as early, globular staged embryos, the suspensions are maintained
as described below.

Soybean embryogenic suspension cultures can be maintained in 35 mL liquid media on a
rotary shaker, 150 rpm, at 26°C with fluorescent lights on a 16:8 hour day/night schedule.
Cultures are subcultured every two weeks by inoculating approximately 35 mg of tissue into
35 mL of liquid medium.

Soybean embryogenic suspension cultures may then be transformed by the method of
particle gun bombardment (Klein et al. (1987) Nature (London) 327:70, U.S. Patent
No. 4,945,050). A DuPont Biolistic™ PDS1000/HE instrument (helium retrofit) can be used
for these transformations.

A selectable marker gene which can be used to facilitate soybean transformation is a
chimeric gene composed of the 35S promoter from Cauliflower Mosaic Virus (Odell et al.
(1985) Nature 313:810-812), the hygromycin phosphotransferase gene from plasmid pJR225
(from E. coli; Gritz et al. (1983) Gene 25:179-188) and the 3' region of the nopaline synthase
gene from the T-DNA of the Ti plasmid of Agrobacterium tumefaciens. The seed expression
cassette comprising the phaseolin 5' region, the fragment encoding the cell cycle regulatory
protein and the phaseolin 3' region can be isolated as a restriction fragment. This fragment
can then be inserted into a unique restriction site of the vector carrying the marker gene.

To 50 µL of a 60 mg/mL 1 µm gold particle suspension is added (in order): 5 µL
DNA (1 µg/µL), 20 µL spermidine (0.1 M), and 50 µL CaCl2 (2.5 M). The particle
preparation is then agitated for three minutes, spun in a microfuge for 10 seconds and the
supernatant removed. The DNA-coated particles are then washed once in 400 µL 70%
ethanol and resuspended in 40 µL of anhydrous ethanol. The DNA/particle suspension can
be sonicated three times for one second each. Five µL of the DNA-coated gold particles are
then loaded on each macro carrier disk.
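The quantities in the particle preparation above can be checked with a little arithmetic. The sketch below works out the composition of the coating mix and an upper bound on the material loaded per macrocarrier. It assumes the volumes are additive, that all of the gold is recovered through the ethanol wash, and that all of the DNA binds to the particles; none of these assumptions is stated in the protocol, so the per-disk DNA figure is a ceiling rather than a measurement. The function and key names are illustrative only.

```python
def coating_mix(gold_ul=50, gold_mg_per_ml=60, dna_ul=5, dna_ug_per_ul=1.0,
                spermidine_ul=20, spermidine_molar=0.1,
                cacl2_ul=50, cacl2_molar=2.5):
    """Composition of the coating mix, assuming additive volumes."""
    total_ul = gold_ul + dna_ul + spermidine_ul + cacl2_ul
    return {
        "total_ul": total_ul,
        "gold_mg": gold_ul * gold_mg_per_ml / 1000,
        "dna_ug": dna_ul * dna_ug_per_ul,
        "spermidine_mM": spermidine_ul * spermidine_molar / total_ul * 1000,
        "cacl2_M": cacl2_ul * cacl2_molar / total_ul,
    }


def per_disk(mix, final_ul=40, load_ul=5):
    """Upper bound on gold and DNA per disk, assuming no losses on washing."""
    fraction = load_ul / final_ul
    return {
        "disks": final_ul // load_ul,
        "gold_mg": mix["gold_mg"] * fraction,
        "dna_ug_max": mix["dna_ug"] * fraction,
    }


mix = coating_mix()
disk = per_disk(mix)
```

Under these assumptions the mix is 125 µL at about 1 M CaCl2 and 16 mM spermidine, and one preparation loads eight disks.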

Approximately 300-400 mg of a two-week-old suspension culture is placed in an
empty 60x15 mm petri dish and the residual liquid removed from the tissue with a pipette.
For each transformation experiment, approximately 5-10 plates of tissue are normally
bombarded. Membrane rupture pressure is set at 1100 psi and the chamber is evacuated to a
vacuum of 28 inches mercury. The tissue is placed approximately 3.5 inches away from the
retaining screen and bombarded three times. Following bombardment, the tissue can be
divided in half and placed back into liquid and cultured as described above.

Five to seven days post bombardment, the liquid media may be exchanged with fresh
media, and eleven to twelve days post bombardment with fresh media containing 50 mg/mL
hygromycin. This selective media can be refreshed weekly. Seven to eight weeks post
bombardment, green, transformed tissue may be observed growing from untransformed,
necrotic embryogenic clusters. Isolated green tissue is removed and inoculated into
individual flasks to generate new, clonally propagated, transformed embryogenic suspension
cultures. Each new line may be treated as an independent transformation event. These
suspensions can then be subcultured and maintained as clusters of immature embryos or
regenerated into whole plants by maturation and germination of individual somatic embryos.

EXAMPLE 9 
Expression of Chimeric Genes in Microbial Cells 
The cDNAs encoding the instant cell cycle regulatory proteins can be inserted into the
T7 E. coli expression vector pBT430. This vector is a derivative of pET-3a (Rosenberg et al.
(1987) Gene 56:125-135) which employs the bacteriophage T7 RNA polymerase/T7
promoter system. Plasmid pBT430 was constructed by first destroying the EcoR I and
Hind III sites in pET-3a at their original positions. An oligonucleotide adaptor containing
EcoR I and Hind III sites was inserted at the BamH I site of pET-3a. This created pET-3aM
with additional unique cloning sites for insertion of genes into the expression vector. Then,
the Nde I site at the position of translation initiation was converted to an Nco I site using
oligonucleotide-directed mutagenesis. The DNA sequence of pET-3aM in this region,
5'-CATATGG, was converted to 5'-CCCATGG in pBT430.
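The effect of the mutagenesis on the two restriction sites can be confirmed directly from the seven-nucleotide sequences quoted above. The sketch below is only a string-level check: the recognition sequences for Nde I (CATATG) and Nco I (CCATGG) are the standard ones, while the helper names are illustrative.

```python
NDE_I = "CATATG"   # Nde I recognition sequence
NCO_I = "CCATGG"   # Nco I recognition sequence

before = "CATATGG"  # pET-3aM, as quoted in the text
after = "CCCATGG"   # pBT430, as quoted in the text


def has_site(seq, site):
    """True if the recognition sequence occurs in the given strand."""
    return site in seq.upper()


def substitutions(a, b):
    """Positions (0-based) at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


changed = substitutions(before, after)
```

Two base substitutions remove the Nde I site and create the Nco I site, and the ATG initiation codon stays at the same position in both sequences.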

Plasmid DNA containing a cDNA may be appropriately digested to release a nucleic
acid fragment encoding the protein. This fragment may then be purified on a 1% NuSieve
GTG™ low melting agarose gel (FMC). Buffer and agarose contain 10 µg/mL ethidium
bromide for visualization of the DNA fragment. The fragment can then be purified from the
agarose gel by digestion with GELase™ (Epicentre Technologies) according to the
manufacturer's instructions, ethanol precipitated, dried and resuspended in 20 µL of water.
Appropriate oligonucleotide adapters may be ligated to the fragment using T4 DNA ligase
(New England Biolabs, Beverly, MA). The fragment containing the ligated adapters can be
purified from the excess adapters using low melting agarose as described above. The vector
pBT430 is digested, dephosphorylated with alkaline phosphatase (NEB) and deproteinized
with phenol/chloroform as described above. The prepared vector pBT430 and fragment can
then be ligated at 16°C for 15 hours followed by transformation into DH5 electrocompetent
cells (GIBCO BRL). Transformants can be selected on agar plates containing LB media and
100 µg/mL ampicillin. Transformants containing the gene encoding the cell cycle regulatory
protein are then screened for the correct orientation with respect to the T7 promoter by
restriction enzyme analysis.

For high level expression, a plasmid clone with the cDNA insert in the correct
orientation relative to the T7 promoter can be transformed into E. coli strain BL21(DE3)
(Studier et al. (1986) J. Mol. Biol. 189:113-130). Cultures are grown in LB medium
containing ampicillin (100 mg/L) at 25°C. At an optical density at 600 nm of approximately
1, IPTG (isopropylthio-β-galactoside, the inducer) can be added to a final concentration of
0.4 mM and incubation can be continued for 3 h at 25°C. Cells are then harvested by
centrifugation and re-suspended in 50 µL of 50 mM Tris-HCl at pH 8.0 containing 0.1 mM
DTT and 0.2 mM phenyl methylsulfonyl fluoride. A small amount of 1 mm glass beads can
be added and the mixture sonicated 3 times for about 5 seconds each time with a microprobe
sonicator. The mixture is centrifuged and the protein concentration of the supernatant
determined. One µg of protein from the soluble fraction of the culture can be separated by
SDS-polyacrylamide gel electrophoresis. Gels can be observed for protein bands migrating
at the expected molecular weight.
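A rough figure for the expected molecular weight can be obtained from the length of the encoded polypeptide. The sketch below uses the common rule of thumb of about 110 Da per amino acid residue, which is an approximation and not a value taken from this disclosure; an exact figure would require summing the residue masses of the actual sequence. The `tag_kda` parameter is a hypothetical allowance for a fusion partner.

```python
AVERAGE_RESIDUE_DA = 110.0  # rule-of-thumb average mass per residue (assumption)


def approx_mass_kda(n_residues, tag_kda=0.0):
    """Approximate polypeptide mass in kDa from its length in residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * AVERAGE_RESIDUE_DA / 1000.0 + tag_kda


# Example input: the 264-residue length listed for SEQ ID NO:4.
estimate = approx_mass_kda(264)
```

For a 264-residue polypeptide this gives roughly 29 kDa. Actual migration on SDS-polyacrylamide gels can differ from the calculated mass, so the estimate only indicates where to look for the band.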

EXAMPLE 10
Evaluating Compounds for Their Ability to Inhibit the Activity
of Cell Cycle Regulatory Proteins
The cell cycle regulatory proteins described herein may be produced using any number
of methods known to those skilled in the art. Such methods include, but are not limited to,
expression in bacteria as described in Example 9, or expression in eukaryotic cell culture,
in planta, and using viral expression systems in suitably infected organisms or cell lines.
The instant cell cycle regulatory proteins may be expressed either as mature forms of the
proteins as observed in vivo or as fusion proteins by covalent attachment to a variety of
enzymes, proteins or affinity tags. Common fusion protein partners include glutathione
S-transferase ("GST"), thioredoxin ("Trx"), maltose binding protein, and C- and/or
N-terminal hexahistidine polypeptide ("(His)6"). The fusion proteins may be engineered
with a protease recognition site at the fusion point so that fusion partners can be separated by
protease digestion to yield intact mature enzyme. Examples of such proteases include
thrombin, enterokinase and factor Xa. However, any protease can be used which specifically
cleaves the peptide connecting the fusion protein and the enzyme.

Purification of the instant cell cycle regulatory proteins, if desired, may utilize any
number of separation technologies familiar to those skilled in the art of protein purification.
Examples of such methods include, but are not limited to, homogenization, filtration,
centrifugation, heat denaturation, ammonium sulfate precipitation, desalting, pH
precipitation, ion exchange chromatography, hydrophobic interaction chromatography and
affinity chromatography, wherein the affinity ligand represents a substrate, substrate analog
or inhibitor. When the cell cycle regulatory proteins are expressed as fusion proteins, the
purification protocol may include the use of an affinity resin which is specific for the fusion
protein tag attached to the expressed enzyme or an affinity resin containing ligands which are
specific for the enzyme. For example, a cell cycle regulatory protein may be expressed as a
fusion protein coupled to the C-terminus of thioredoxin. In addition, a (His)6 peptide may be
engineered into the N-terminus of the fused thioredoxin moiety to afford additional
opportunities for affinity purification. Other suitable affinity resins could be synthesized by
linking the appropriate ligands to any suitable resin such as Sepharose-4B. In an alternate
embodiment, a thioredoxin fusion protein may be eluted using dithiothreitol; however,
elution may be accomplished using other reagents which interact to displace the thioredoxin
from the resin. These reagents include β-mercaptoethanol or other reduced thiol. The
eluted fusion protein may be subjected to further purification by traditional means as stated
above, if desired. Proteolytic cleavage of the thioredoxin fusion protein and the enzyme may
be accomplished after the fusion protein is purified or while the protein is still bound to the
ThioBond™ affinity resin or other resin.

Crude, partially purified or purified protein, either alone or as a fusion protein, may be
utilized in assays for the evaluation of compounds for their ability to inhibit enzymatic
activation of the cell cycle regulatory proteins disclosed herein. Assays may be conducted
under well known experimental conditions which permit optimal protein activity. For
example, assays for E2F and DP-1 transcription factor activities are presented by Helin K.,
et al., 1994 Genes Dev. 7(10):1850-1861 and Ivey-Hoyle M., et al., 1993 Mol Cell Biol
13(12):7802-7812. Assays for DP-2 activity are presented by Zhang Y., et al., 1995
Oncogene 10(11):2085-2093. Assays for CDC-16 activity are presented by Lamb, J. R.,
et al., 1994 EMBO J. 13(18):4321-4328.
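Whatever assay is chosen, a compound is evaluated by comparing the activity of treated protein with that of an untreated control. A minimal sketch of that comparison follows. The activity units, the compound names and the 50% cut-off are hypothetical; the disclosure does not specify a scoring rule, so this is one simple convention among several.

```python
def percent_inhibition(treated_activity, untreated_activity):
    """Inhibition relative to the untreated control, in percent."""
    if untreated_activity <= 0:
        raise ValueError("untreated activity must be positive")
    return 100.0 * (1.0 - treated_activity / untreated_activity)


def candidate_inhibitors(results, untreated_activity, threshold=50.0):
    """Names of compounds whose inhibition meets the (assumed) threshold.

    results maps a compound name to the activity measured after treatment.
    """
    return sorted(
        name
        for name, activity in results.items()
        if percent_inhibition(activity, untreated_activity) >= threshold
    )


# Hypothetical measurements in arbitrary activity units.
measurements = {"cmpd_A": 25.0, "cmpd_B": 90.0, "cmpd_C": 50.0}
hits = candidate_inhibitors(measurements, untreated_activity=100.0)
```

A negative result from `percent_inhibition` indicates that the treated sample was more active than the control, which the threshold filter simply excludes.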
CLAIMS

What is claimed is:

1. An isolated nucleic acid fragment encoding all or a substantial portion of a
CDC-16 protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:2 and 4;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:2 and 4; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

2. The isolated nucleic acid fragment of Claim 1 wherein the nucleotide sequence
of the fragment comprises all or a portion of the sequence set forth in a member selected
from the group consisting of SEQ ID NO:1 and 3.

3. A chimeric gene comprising the nucleic acid fragment of Claim 1 operably
linked to suitable regulatory sequences.

4. A transformed host cell comprising the chimeric gene of Claim 3.

5. A CDC-16 polypeptide comprising all or a substantial portion of the amino
acid sequence set forth in a member selected from the group consisting of SEQ ID NO:2 and
4.

6. An isolated nucleic acid fragment encoding all or a substantial portion of a
DP-1 protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:6, 8 and 10;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:6, 8 and 10; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

7. The isolated nucleic acid fragment of Claim 6 wherein the nucleotide sequence
of the fragment comprises all or a portion of the sequence set forth in a member selected
from the group consisting of SEQ ID NO:5, 7 and 9.

8. A chimeric gene comprising the nucleic acid fragment of Claim 6 operably
linked to suitable regulatory sequences.

9. A transformed host cell comprising the chimeric gene of Claim 8.

10. A DP-1 polypeptide comprising all or a substantial portion of the amino acid
sequence set forth in a member selected from the group consisting of SEQ ID NO:6, 8 and
10.

11. An isolated nucleic acid fragment encoding all or a substantial portion of a
DP-2 protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:12 and 14;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:12 and 14; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

12. The isolated nucleic acid fragment of Claim 11 wherein the nucleotide
sequence of the fragment comprises all or a portion of the sequence set forth in a member
selected from the group consisting of SEQ ID NO:11 and 13.

13. A chimeric gene comprising the nucleic acid fragment of Claim 11 operably
linked to suitable regulatory sequences.

14. A transformed host cell comprising the chimeric gene of Claim 13.

15. A DP-2 polypeptide comprising all or a substantial portion of the amino acid
sequence set forth in a member selected from the group consisting of SEQ ID NO:12 and 14.

16. An isolated nucleic acid fragment encoding all or a substantial portion of an E2F
protein comprising a member selected from the group consisting of:

(a) an isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:16 and 18;

(b) an isolated nucleic acid fragment that is substantially similar to an
isolated nucleic acid fragment encoding all or a substantial portion of
the amino acid sequence set forth in a member selected from the group
consisting of SEQ ID NO:16 and 18; and

(c) an isolated nucleic acid fragment that is complementary to (a) or (b).

17. The isolated nucleic acid fragment of Claim 16 wherein the nucleotide
sequence of the fragment comprises all or a portion of the sequence set forth in a member
selected from the group consisting of SEQ ID NO:15 and 17.

18. A chimeric gene comprising the nucleic acid fragment of Claim 16 operably
linked to suitable regulatory sequences.

19. A transformed host cell comprising the chimeric gene of Claim 18.

20. An E2F polypeptide comprising all or a substantial portion of the amino acid
sequence set forth in a member selected from the group consisting of SEQ ID NO:16 and 18.

21. A method of altering the level of expression of a cell cycle regulatory protein in
a host cell comprising:

(a) transforming a host cell with the chimeric gene of any of Claims 3, 8,
13 and 18; and

(b) growing the transformed host cell produced in step (a) under conditions
that are suitable for expression of the chimeric gene

wherein expression of the chimeric gene results in production of altered levels of a cell cycle
regulatory protein in the transformed host cell.

22. A method of obtaining a nucleic acid fragment encoding all or a substantial
portion of the amino acid sequence encoding a cell cycle regulatory protein comprising:

(a) probing a cDNA or genomic library with the nucleic acid fragment of
any of Claims 1, 6, 11 and 16;

(b) identifying a DNA clone that hybridizes with the nucleic acid fragment
of any of Claims 1, 6, 11 and 16;

(c) isolating the DNA clone identified in step (b); and

(d) sequencing the cDNA or genomic fragment that comprises the clone
isolated in step (c)

wherein the sequenced nucleic acid fragment encodes all or a substantial portion of the
amino acid sequence encoding a cell cycle regulatory protein.

23. A method of obtaining a nucleic acid fragment encoding a substantial portion
of an amino acid sequence encoding a cell cycle regulatory protein comprising:

(a) synthesizing an oligonucleotide primer corresponding to a portion of the
sequence set forth in any of SEQ ID NOs:1, 3, 5, 7, 9, 11, 13, 15, and
17; and

(b) amplifying a cDNA insert present in a cloning vector using the
oligonucleotide primer of step (a) and a primer representing sequences
of the cloning vector

wherein the amplified nucleic acid fragment encodes a substantial portion of an amino acid
sequence encoding a cell cycle regulatory protein.

24. The product of the method of Claim 22. 

25. The product of the method of Claim 23. 

26. A method for evaluating at least one compound for its ability to inhibit the
activity of a cell cycle regulatory protein, the method comprising the steps of:

(a) transforming a host cell with a chimeric gene comprising a nucleic acid
fragment encoding a cell cycle regulatory protein operably linked to
suitable regulatory sequences;

(b) growing the transformed host cell under conditions that are suitable for
expression of the chimeric gene wherein expression of the chimeric
gene results in production of the cell cycle regulatory protein encoded
by the operably linked nucleic acid fragment in the transformed host
cell;

(c) optionally purifying the cell cycle regulatory protein expressed by the
transformed host cell;

(d) treating the cell cycle regulatory protein with a compound to be tested;
and

(e) comparing the activity of the cell cycle regulatory protein that has been
treated with a test compound to the activity of an untreated cell cycle
regulatory protein,
thereby selecting compounds with potential for inhibitory activity.

SEQUENCE LISTING



<110> E. I. DU PONT DE NEMOURS AND COMPANY 

<120> CELL CYCLE REGULATORY PROTEINS 

<130> BB-1161 

<140> 
<141> 

<150> 60/081,132
<151> APRIL 9, 1998 

<160> 18 

<170> Microsoft Office 97 

<210> 1 

<211> 507 

<212> DNA 

<213> Zea mays

<220> 

<221> unsure 
<222> (333) 

<220> 

<221> unsure 
<222> (364) 

<220> 

<221> unsure 
<222> (403) 

<220> 

<221> unsure 
<222> (426) 

<220> 

<221> unsure 
<222> (482) 

<400> 1

gagaccaccg atccgccgtc ggagatgagg gaggaggcgc tggagcggct gcgcggggtg 60 

gtacgggaca gcgccgggaa gcacctctac acgtcggcca tcttcctcgc cgacaaggtg 120 

gcggcggcca cgggggaccc tggcgacgtc tacatgctcg cgcaggcact cttcctcggt 180 

cgtcaattcc gtcacgcgct ccacctcctc aacaattctc gcctgctccg cgacctccgg 240 

ttcagattcc tcgccgccaa gtgcctcgag gagttgaaag agtggcatca gtgtttgttg 300 

atgcttgggg atgcaaaagt ggatgagcat ggnaatgtcc tttgatcatg atgatgacag 360 

tganatttat tttgataagg acgcggaaag atcatgagat cantatcaaa tcagctctat 420 

gtttcntacg tggtaaaggc atatgaagca ctgggacaac cgtgatcttg ctcgccaatg 480 
gnacaaaggc agctgttaaa gccgatg 507

<210> 2 

<211> 89 

<212> PRT 

<213> Zea mays 

<400> 2 

Leu Glu Arg Leu Arg Gly Val Val Arg Asp Ser Ala Gly Lys His Leu
1 5 10 15

Tyr Thr Ser Ala Ile Phe Leu Ala Asp Lys Val Ala Ala Ala Thr Gly
20 25 30

Asp Pro Gly Asp Val Tyr Met Leu Ala Gln Ala Leu Phe Leu Gly Arg
35 40 45

Gln Phe Arg His Ala Leu His Leu Leu Asn Asn Ser Arg Leu Leu Arg
50 55 60

Asp Leu Arg Phe Arg Phe Leu Ala Ala Lys Cys Leu Glu Glu Leu Lys
65 70 75 80

Glu Trp His Gln Cys Leu Leu Met Leu
85

<210> 3 

<211> 1093 

<212> DNA 

<213> Glycine max 



<400> 3 

gcacgaggtt ttgagttaac aaatgatttg cttgagaagg atcttttcca tctaaagact 60
acgctagtac atctggcagc tgctgtggag cttgggcatt caaatgaact ctatctgatg 120
tcatgcaatt tggtgaagga ttatcctcag atggctttat catggtttgc tgttggttgc 180
tactactatt gtatcaagaa gtatgatcaa tcccgccgtt attttagcaa ggcaactagt 240
ctagatggaa catttccccc tgcatggata ggctatggga atgcttatgc tgctcaagaa 300
gagggagacc aggctatgtc tgcttaccgc actgcagcta gattgtttcc tgggtgccat 360
ttagcaactc tttacattgg aatggaatgc atgcggaccc acagttataa gcttgctgag 420
cagtttttta cgcaggccaa gtcaatttgc tcctcagatc cacttgtgta taatgaactt 480
ggagttgttg cctaccatat ggaagagtat aagaaagcag tgtggtggtt tgagaaaaca 540
ttagccttgg taccaaccac tttatctgaa atttgggaat cgactgtagt caatcttgca 600
catgcataca gaaagctgaa gatgtaccga gaagcaattt catattatga gaaagcactt 660
gcattatcaa caagaagtgt gagcacttat gccggtcttg catatactta ccacctacag 720
gatgatttca ccacagcaat tgcatattat cacaaagcct tgtggctgaa accagatgat 780
cagttctgca ctgaaatgtt aagttgggct ctaatagatg aaagtcgaag aggcgttgac 840
cccagtcttg aattccgttg aagtctgatt tgaatatgat ggtgcagcca tttgcacttg 900
taaatgtgta acaatgagtg ccgagttgat gtattgccaa tgtgcaattg ttaattatta 960
ttatcagttg tagataatct gtattttgtt tctgaataat gtttgatttt attatatctc 1020
tagcaatttt atactagtta aatattccgt gacttcacag gtattaatct taaaatttgt 1080
aatttttaat att 1093



<210> 4 

<211> 264 

<212> PRT 

<213> Glycine max 



<400> 4 

Phe Glu Leu Thr Asn Asp Leu Leu Glu Lys Asp Leu Phe His Leu Lys
1 5 10 15

Thr Thr Leu Val His Leu Ala Ala Ala Val Glu Leu Gly His Ser Asn
20 25 30

Glu Leu Tyr Leu Met Ser Cys Asn Leu Val Lys Asp Tyr Pro Gln Met
35 40 45

Ala Leu Ser Trp Phe Ala Val Gly Cys Tyr Tyr Tyr Cys Ile Lys Lys
50 55 60

Tyr Asp Gln Ser Arg Arg Tyr Phe Ser Lys Ala Thr Ser Leu Asp Gly
65 70 75 80

Thr Phe Pro Pro Ala Trp Ile Gly Tyr Gly Asn Ala Tyr Ala Ala Gln
85 90 95

Glu Glu Gly Asp Gln Ala Met Ser Ala Tyr Arg Thr Ala Ala Arg Leu
100 105 110

Phe Pro Gly Cys His Leu Ala Thr Leu Tyr Ile Gly Met Glu Cys Met
115 120 125

Arg Thr His Ser Tyr Lys Leu Ala Glu Gln Phe Phe Thr Gln Ala Lys
130 135 140

Ser Ile Cys Ser Ser Asp Pro Leu Val Tyr Asn Glu Leu Gly Val Val
145 150 155 160

Ala Tyr His Met Glu Glu Tyr Lys Lys Ala Val Trp Trp Phe Glu Lys
165 170 175

Thr Leu Ala Leu Val Pro Thr Thr Leu Ser Glu Ile Trp Glu Ser Thr
180 185 190

Val Val Asn Leu Ala His Ala Tyr Arg Lys Leu Lys Met Tyr Arg Glu
195 200 205

Ala Ile Ser Tyr Tyr Glu Lys Ala Leu Ala Leu Ser Thr Arg Ser Val
210 215 220

Ser Thr Tyr Ala Gly Leu Ala Tyr Thr Tyr His Leu Gln Asp Asp Phe
225 230 235 240

Thr Thr Ala Ile Ala Tyr Tyr His Lys Ala Leu Trp Leu Lys Pro Asp
245 250 255

Asp Gln Phe Cys Thr Glu Met Leu
260



<210> 5 

<211> 665 

<212> DNA 

<213> Impatiens balsamia 



<220> 

<221> unsure 
<222> (199) 



<220> 

<221> unsure 
<222> (414) 



<220> 

<221> unsure 
<222> (416) 



<220> 

<221> unsure 
<222> (478) 



<220> 

<221> unsure 
<222> (519) 



<220> 

<221> unsure 
<222> (548)
<220>

<221> unsure 
<222> (554) 

<220> 

<221> unsure 
<222> (569) 

<220> 

<221> unsure 
<222> (578) 

<220> 

<221> unsure 
<222> (598) 

<220> 

<221> unsure 
<222> (611) 

<220> 

<221> unsure 
<222> (617)..(618)

<220> 

<221> unsure 
<222> (632) 

<220> 

<221> unsure 
<222> (635) 

<220> 

<221> unsure 
<222> (637) 

<220> 

<221> unsure 
<222> (647) 

<220> 

<221> unsure 
<222> (657) 

<400> 5 

agaagaaggg tatatgatgc actcaatgta ctcatggcca tggatataat atccaaggac 60 
aaaaaggaaa ttcaatggaa gggtcttcca aggactagct tgaatgacat tgaagagata 120 
aaggcagagc gtctagggct gagaagcaga attgacaaaa aaacagctta tttgcaagaa 180 
cttgaggaac attatgtang tcttcagaat cttgtacaaa ggaatgagcg gttatatggc 240 
tcagggaaca tgccgactgg aggtgtggct ttgcctttta tacttgtgca gacacgaccc 300 
catgcaaccg tcgaaataga aatatcggaa gatatgcagc tggttcattt cgacttcaac 360 
agcactcctt ttgagctaca cgacgataat tatgttatga gggcaatgaa attnancgga 420 
ataaacgatg gtgaatcaat gccaacttac cgacgatgga ggtgaaggcc aagtattnca 480 
agccaacgtc aaggcaaacg aaattgcgtg ctctcttcnt aacttcgaca tagaatttgt 540 
tcgtttgnca cacnctaaat gctggtcgnt ggtctgtnac tgtaacgaaa tccgttantt 600 
gatatctggt naaaatnntg caggattggg gnganancca cctgaancta accttancag 660
gattg 665

<210> 6 

<211> 138 

<212> PRT 

<213> Impatiens balsamia

<220>
<221> UNSURE
<222> (67)

<220>
<221> UNSURE
<222> (86)..(87)

<400> 6

Arg Arg Arg Val Tyr Asp Ala Leu Asn Val Leu Met Ala Met Asp Ile
1 5 10 15

Ile Ser Lys Asp Lys Lys Glu Ile Gln Trp Lys Gly Leu Pro Arg Thr
20 25 30

Ser Leu Asn Asp Ile Glu Glu Ile Lys Ala Glu Arg Leu Gly Leu Arg
35 40 45

Ser Arg Ile Asp Lys Lys Thr Ala Tyr Leu Gln Glu Leu Glu Glu His
50 55 60

Tyr Val Xaa Leu Gln Asn Leu Val Gln Arg Asn Glu Arg Leu Tyr Gly
65 70 75 80

Ser Gly Asn Met Pro Xaa Xaa Pro Thr Gly Gly Val Ala Leu Pro Phe
85 90 95

Ile Leu Val Gln Thr Arg Pro His Ala Thr Val Glu Ile Glu Ile Ser
100 105 110

Glu Asp Met Gln Leu Val His Phe Asp Phe Asn Ser Thr Phe Glu Leu
115 120 125

His Asp Asp Asn Tyr Val Met Arg Ala Met
130 135



<210> 7
<211> 296
<212> DNA
<213> Zea mays

<400> 7
gatattgaag aattgaagac tgagcttgtg 60
gtttacttac aggagctaca agatcaatat
gagcaattat atggttcagg aaacacacct
gtccagaccc gacctcatgc aaccgtggaa
cattttgact tcaatagcac tccatttgag 296

<210> 8
<211> 100
<212> PRT
<213> Zea mays

<220>
<221> UNSURE
<222> (51)..(52)

<400> 8



Asp Ile Glu Glu Leu Lys Thr Glu Leu Val Gly Leu Lys Gly Arg Ile
1 5 10 15

Glu Lys Lys Ser Val Tyr Leu Gln Glu Leu Gln Asp Gln Tyr Val Gly
20 25 30

Leu Gln Asn Leu Ile Gln Arg Asn Glu Gln Leu Tyr Gly Ser Gly Asn
35 40 45

Thr Pro Xaa Xaa Pro Ser Gly Gly Val Ala Leu Pro Phe Ile Leu Val
50 55 60

Gln Thr Arg Pro His Ala Thr Val Glu Val Glu Ile Ser Glu Asp Met
65 70 75 80

Gln Leu Val His Phe Asp Phe Asn Ser Thr Phe Glu Leu His Asp Asp
85 90 95

Ser Tyr Val Leu
100



<210> 9 

<211> 481 

<212> DNA 

<213> Glycine max 

<220> 

<221> unsure 

<222> (37) 

<220> 

<221> unsure 

<222> (82) 

<220> 

<221> unsure 

<222> (85) 

<220> 

<221> unsure 

<222> (110) 

<220> 

<221> unsure 

<222> (112) 

<220> 

<221> unsure 

<222> (118) 

<220> 

<221> unsure 

<222> (153) 

<220> 

<221> unsure 

<222> (155) 

<220> 

<221> unsure 

<222> (169) 

<220> 

<221> unsure 

<222> (171) 

<220> 

<221> unsure 

<222> (183)
<220>

<221> unsure 

<222> (195) 

<220> 

<221> unsure 

<222> (214) 

<220> 

<221> unsure 

<222> (219) 

<220> 

<221> unsure 

<222> (224) 

<220> 

<221> unsure 

<222> (247) 

<220> 

<221> unsure 

<222> (265) 

<220> 

<221> unsure 

<222> (283) 

<220> 

<221> unsure 

<222> (289) 

<220> 

<221> unsure 

<222> (301) 

<220> 

<221> unsure 

<222> (315)..(316)

<220> 

<221> unsure 

<222> (323) 

<220> 

<221> unsure 

<222> (359) 

<220> 

<221> unsure 

<222> (361) 

<220> 

<221> unsure 

<222> (376) 

<220> 

<221> unsure 

<222> (407) 

<220> 

<221> unsure 

<222> (426)..(428)
<220>

<221> unsure 
<222> (445) 

<220> 

<221> unsure 
<222> (448) 

<220> 

<221> unsure 
<222> (457) 

<220> 

<221> unsure 
<222> (460) 

<400> 9 

agcaacaata tgatgagaaa aacatccgcc gaagggncga tgatgctctg aacgttctca 60 

tggcaatgga tattatttct antgncaaaa aaaaaattca atggaggggn cntcctcnca 120 

ctactgtgaa tgatattgaa gaactaaaga canancgtct tgggctcang natagaattg 180 

aanagaaaac aaccnatctg cacgagcttg aggngcaant cgtntgtctt cacaacctta 240 

ttcaacnaaa tgagcaagtt atatngctca agaaatcctc ccncccggng ggggggcctt 300 

nccttttttt tggtnnagaa aantccccat gcaactgtgg ggggggggaa tatcaaaana 360 

natgcaacct ggtcantttg atttcaataa aaaacccttt ttttgcntgg cgaaaattaa 420 

tttttnnncg caagggaatt ttgtntgnga ccactgnccn gggtaatatg acacaaaaac 480 

c 481

<210> 10 

<211> 83 

<212> PRT 

<213> Glycine max 

<220> 

<221> UNSURE 
<222> (10) 

<220> 

<221> UNSURE 
<222> (25)..(26)

<220> 

<221> UNSURE 
<222> (35) 

<220> 

<221> UNSURE 
<222> (37) 

<220> 

<221> UNSURE 
<222> (49) 

<220> 

<221> UNSURE 
<222> (54)..(55)

<220> 

<221> UNSURE 
<222> (59) 

<220> 

<221> UNSURE 
<222> (63)
<220>

<221> UNSURE 
<222> (69) 

<220> 

<221> UNSURE 
<222> (71) 

<220> 

<221> UNSURE 
<222> (80) 

<400> 10 

Tyr Asp Glu Lys Asn Ile Arg Arg Arg Xaa Asp Asp Ala Leu Asn Val
1 5 10 15

Leu Met Ala Met Asp Ile Ile Ser Xaa Xaa Lys Lys Lys Ile Gln Trp
20 25 30

Arg Gly Xaa Pro Xaa Thr Thr Val Asn Asp Ile Glu Glu Leu Lys Thr
35 40 45

Xaa Arg Leu Gly Leu Xaa Xaa Arg Ile Glu Xaa Lys Thr Thr Xaa Leu
50 55 60

His Glu Leu Glu Xaa Gln Xaa Val Cys Leu His Asn Leu Ile Gln Xaa
65 70 75 80

Asn Glu Gln

PCT/US99/07638



<210> 11 

<211> 1193 

<212> DNA 

<213> Zea mays 

<220> 

<221> unsure 
<222> (238) 

<400> 11 

gcgacagcac gttcctccgc ttgaataatc tcgacatcaa cggcgacgac gcgccgtcgt 60 

cgcaggctcc tacgagcaag aagaaaagga gaggcacacg ggcagtgggt cctgataaag 120 

gtaaccgggg actgcgccag tttagtatga aagtttgtga gaaagttgaa agtaaaggga 180 

gaacaacata taatgaggtg gcagatgaac ttgttgctga gtttacagac cccaacanta 240 

atattgaggc accagatcct gataacccta acgcgcaaca atatgatgag aaaaatatac 300 

gagggcgagt ttatgatgct ttaaatgttc tgatggctat ggacattata tctaaagata 360 

aaaaggagat ccagtggaag ggcttgccgc ggactagtat aagtgatatt gaagaattga 420 

agactgagct tgtgggactg aaaggtagaa ttgagaagaa aagtgtttac ttacaggagc 480 

tacaagatca atatgtaggt ttgcaaaacc tgattcaacg aaatgagcaa ttatatggtt 540 

caggaaacac accttctggt ggagtggctt tgccattcat cctagtccag acccgacctc 600 

atgctaccgt ggaagttgag atatcagaag atatgcagct ggtgcatttt gacttcaata 660 

gcaccccatt cgagctgcac gacgactcat acgtcctaaa agaaatgcga ttctgtggaa 720 

gagaacaaca tgacagcact caagagtcga tatcaaatgg aggtgagagc tcaagcgtgt 780 

caaatattta ttggcaacac gtacagcatg tggaaaggcc aaacaatggc acaggtaggt 840 

taccgagctc accgcctatt ccagggatct tgaaagggcg tgcgaagcac gagcactaac 900 

gctagggtgt tggttcactt tccttttcgt ctgagcagtt ggttttattg catttctccg 960 

ttgtgtaaag ggccctgtaa attattaggc aagcgggagc gtagcttgat ctaatttagc 1020 

tctgcaccag attggtagaa cgacgggtgc tctaagtagt tgtgtaacta taactatcct 1080 

tgaatcggtt tctttagtat ggttgagaga agggttgaca tgtaattttg tagagccttg 1140 

taaaattaaa attgttgatt tctatcaggg tgaaatttct gggcacaaaa aaa 1193 

<210> 12 
<211> 194 
<212> PRT 
<213> Zea mays 

<220> 

<221> UNSURE 
<222> (42) 

<400> 12 

Asp Lys Gly Asn Arg Gly Leu Arg Gln Phe Ser Met Lys Val Cys Glu
1               5                   10                  15

Lys Val Glu Ser Lys Gly Arg Thr Thr Tyr Asn Glu Val Ala Asp Glu
            20                  25                  30

Leu Val Ala Glu Phe Thr Asp Pro Asn Xaa Asn Ile Glu Ala Asn Ala
        35                  40                  45

Gln Gln Tyr Asp Glu Lys Asn Ile Arg Gly Arg Val Tyr Asp Ala Leu
    50                  55                  60

Asn Val Leu Met Ala Met Asp Ile Ile Ser Lys Asp Lys Lys Glu Ile
65                  70                  75                  80

Gln Trp Lys Gly Leu Pro Arg Thr Ser Ile Ser Asp Ile Glu Glu Leu
                85                  90                  95

Lys Thr Glu Leu Val Gly Leu Lys Gly Arg Ile Glu Lys Lys Ser Val
            100                 105                 110

Tyr Leu Gln Glu Leu Gln Asp Gln Tyr Val Gly Leu Gln Asn Leu Ile
        115                 120                 125

Gln Arg Asn Gln Arg Asn Glu Gln Leu Tyr Gly Ser Gly Asn Thr Pro
    130                 135                 140

Ser Gly Gly Val Ala Leu Pro Phe Ile Leu Val Gln Thr Arg Pro His
145                 150                 155                 160

Ala Thr Val Glu Val Glu Ile Ser Glu Asp Met Gln Leu Val His Phe
                165                 170                 175

Asp Phe Asn Ser Thr Phe Glu Leu His Asp Asp Ser Tyr Val Leu Lys
            180                 185                 190

Glu Met

<210> 13 
<211> 698 
<212> DNA 

<213> Triticum aestivum 
<400> 13 

gcacgaggtt tgagtgatat tgataaattg aagactgagg tcattgggct gaaaggtaga 60 

attgacaaga aaagtgcata tctgcaggaa ttacaagatc aatatgcggg cctccaaaat 120 

ttggtagagc gaaatgagca gctatatggt tcgggagatg ctccatctgg cggagtggcc 180 

ctgccattca tattggttca gacacgtcct catgcaactg tcgaagtgga gatatcagaa 240 

gatatgcagt tggtgcattt tgatttcaat agcactccgt ttgagttgca cgatgattcc 300 

tttgtattga aagcaatggg gttctctggt aaagaagaaa ctgacggtac agtggctctg 360 

gttgcaaatg cggttgaatg ctcaagtgca tcaaatgttt atgggcgtcg atcaccacaa 420 

cctgcaaggc caaatggaat taggctacga acctcacctc ctattccagg gatactgaaa 480 

gggcgtgtca agcatgaaca ctaggcacta aaatgggttg ttaatgtttt gagagctact 540 

tggatttata ctcccttggc ttctgtaact taacatgtaa aaaggcttgt aaattatgtc 600 

catggggaaa tgtgatgttc taatttagct ttgcagccga tggtaggctg atcggcatgt 660 
gaagctccag gcctgatttt aacttttagc tcaaaaaa 698 

<210> 14 

<211> 92 

<212> PRT 

<213> Triticum aestivum 



<400> 14 

Arg Ile Asp Lys Lys Ser Ala Tyr Leu Gln Glu Leu Gln Asp Gln Tyr
1               5                   10                  15

Ala Gly Leu Gln Asn Leu Val Glu Arg Asn Glu Arg Asn Glu Gln Leu
            20                  25                  30

Tyr Gly Ser Gly Asp Ala Pro Ser Gly Gly Val Ala Leu Pro Phe Ile
        35                  40                  45

Leu Val Gln Thr Arg Pro His Ala Thr Val Glu Val Glu Ile Ser Glu
    50                  55                  60

Asp Met Gln Leu Val His Phe Asp Phe Asn Ser Thr Phe Glu Leu His
65                  70                  75                  80

Asp Asp Ser Phe Val Leu Lys Ala Met Gly Phe Ser
                85                  90



<210> 15 

<211> 540 

<212> DNA 

<213> Oryza sativa 



<220> 

<221> unsure 
<222> (413) 



<220> 

<221> unsure 
<222> (461) 



<220> 

<221> unsure 
<222> (471) 



<220> 

<221> unsure 

<222> (498) 

<220> 

<221> unsure 

<222> (517) 



<400> 15 

ctgaagatga catcaagtct ttgccctgct tccaagaatc agacactgat cgcaatcaaa 60 

gcgcctcatg gtacaacttt ggaggtccca gatcctgatg aagtgaatga ttatccacag 120 

aggagatata ggattgttct aaggagtact atgggtccaa tagatgtcta cctagttagt 180 

caatttgagg agatgagtgg catggagact cctccaagga ctgtgcagcc agtaagcatg 240 

gattctctag agaatcccag gacgccattg gctgcagaac ccaacaaagc tgcagagtca 300 

caagccaaat attcaagatg ggctgctaat ggccttctga tgctccttct agttcacaag 360 

gatattggcg ggatgattga agaattgtcc cttcagaaac tttgataccg atngcagact 420 

actgggctcc ttatcaaaat tgccgggggg ttagcaatta nccgacatgt ngggaagaac 480 

aagcaacaaa aaggtggngg tggggaaagg gaatccnaaa aaattcaatt gcaagagggg 540 

<210> 16 
<211> 53 
<212> PRT 
<213> Oryza sativa 

<220> 
<221> UNSURE 
<222> (31) 



<400> 16 

Leu Cys Pro Ala Ser Lys Asn Gln Thr Leu Ile Ala Ile Lys Ala Pro
1               5                   10                  15

His Gly Thr Thr Leu Glu Val Pro Asp Pro Asp Glu Val Asn Xaa Gln
            20                  25                  30

Arg Arg Tyr Arg Ile Val Leu Arg Ser Thr Met Gly Pro Ile Asp Val
        35                  40                  45

Tyr Leu Val Ser Gln
    50

<210> 17 

<211> 1600 

<212> DNA 

<213> Glycine max 



<400> 17 

gtacgagcga atcttcttca ttacatggct tcttcagatc ccatctcttc tcgccactac 60 

acttacaaca gaaaacaaaa atctctaggc cttctctgca ctaatttctt gagcttgtac 120 

aaccgagaca ccgttcactt gattggccta gacgatgctg ccacccgatt aggtgttgag 180 

agacggcgga tctatgacat tgtgaatgtc ctggagagta tcggggtact ctcaaggaaa 240 

gcaaaaaatc aatacacatg gagagggttt gccgcaatcc ctctcactct acaggatctc 300 

aaggaagagg ggttgaagga gaattctaat tccctacgtg ggcctggaaa ccatgataaa 360 

gtttccgacg acgaagatga cgaggaaaca cagtccaatc ccgccgctac tggaagtcaa 420 

agtgacaagt tgaaccctaa ttctactctt cctaaacctc tgaaaaatga aaatcgaaga 480 

gaaaaatctc tggctctgct tacccagaat tttgtcaagc tctttgtctg ctctaatgtg 540 

gaattgatct cccttgacga ggctgcaaaa ttgctacttg gggatgccca taatacgtct 600 

gtaatgagaa ctaaagttag acgcctatat gacatcgcaa atgtgctatc ctccatgaac 660 

cttattgaga agacccatac aatggataca cgaaaaccag cattcaggtg gctcggttct 720 

gaagggaaga catgggatga gacacttcac aaatcaaacc taaacgactc taggaaaagg 780 

gcgtttggaa gtgatatcac taacatcagt tttgagagga ataaagtgga attgttcacg 840 

agtggggact tgaatccgaa tcctaagaag ccaagaatgg aaaatggtag tgggctgggt 900 

gaggcagacg aaaacaattt aaaacaaggc ataaaacaag cttcaaagag ctatgaattt 960 

ggtcctttcg ctccagcttg cgtgcccaaa gttggagcct ctcagaataa taatatgaag 1020 

caggtgcatg actggggcag tcttgcaact gcacatagcc ctcagtatca aaatgaagcc 1080 

ttgagagaac tcttctccca ttacatggaa gcatggaaat tgtggtactc agaaattgct 1140 

aggaagaggc cattacaaat tttgtaaacg tatgccaaag tcaaactata ggtatgtttg 1200 

gatttagtaa aatttagtct ggaaacttgg tttcagagaa gcaagacttg ggatactagt 1260 

gtaaacttaa gtttgcaaat tttaatccaa acatggggtg taactttcct cactagagaa 1320 

tatctgtagt atgcattctc ttcatgcgtt tgtagagctg agcttccatt tactgccttt 1380 

tcaaccgtga tgttgaccga ctgaatcagc ttaggcagtt cttgcaaatt atttatcaca 1440 

tttgtttgtt gtatagtttc taccctttat ttaatgtagc aggaaatctt gtgcactact 1500 

cctgtactgc agtctggagc tgctaattta aggaaaataa aattcgagtt gttttaaaaa 1560 

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1600 



<210> 18 

<211> 80 

<212> PRT 

<213> Glycine max 

<400> 18 

Met Ala Ser Ser Asp Pro Ile Ser Ser Arg His Tyr Thr Tyr Asn Arg
1               5                   10                  15

Lys Gln Lys Ser Leu Gly Leu Leu Cys Thr Asn Phe Leu Ser Leu Tyr
            20                  25                  30

Asn Arg Asp Thr Val His Leu Ile Gly Leu Asp Asp Ala Ala Thr Arg
        35                  40                  45

Leu Gly Val Glu Arg Arg Arg Ile Tyr Asp Ile Val Asn Val Leu Glu
    50                  55                  60

Ser Ile Gly Val Leu Ser Arg Lys Ala Lys Asn Gln Tyr Thr Trp Arg
65                  70                  75                  80